How to make predictions on TensorFlow's Wide and Deep model loaded in TensorFlow Servings model_server - tensorflow

Can someone assist me in making predictions on TensorFlow's Wide and Deep Learning model loaded into TensorFlow Serving's model_server?
If anyone could point me to a resource or documentation for the same would be really helpful.

You can possibly try to invoke the predict method of the estimator and set the as_iterable as false for an ndarray
y = m.predict(input_fn=lambda: input_fn(df_test), as_iterable=False)
However, note the deprecation note here for future compatibility.

If your model is exported using Estimator.export_savedmodel() and you successfully built TensorFlow Serving itself, you can do something like this:
from grpc.beta import implementations
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2
tf.app.flags.DEFINE_string('server', 'localhost:9000', 'Server host:port.')
tf.app.flags.DEFINE_string('model', 'wide_and_deep', 'Model name.')
FLAGS = tf.app.flags.FLAGS
...
def main(_):
host, port = FLAGS.server.split(':')
# Set up a connection to the TF Model Server
channel = implementations.insecure_channel(host, int(port))
stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)
# Create a request that will be sent for an inference
request = predict_pb2.PredictRequest()
request.model_spec.name = FLAGS.model
request.model_spec.signature_name = 'serving_default'
# A single tf.Example that will get serialized and turned into a TensorProto
feature_dict = {'age': _float_feature(value=25),
'capital_gain': _float_feature(value=0),
'capital_loss': _float_feature(value=0),
'education': _bytes_feature(value='11th'.encode()),
'education_num': _float_feature(value=7),
'gender': _bytes_feature(value='Male'.encode()),
'hours_per_week': _float_feature(value=40),
'native_country': _bytes_feature(value='United-States'.encode()),
'occupation': _bytes_feature(value='Machine-op-inspct'.encode()),
'relationship': _bytes_feature(value='Own-child'.encode()),
'workclass': _bytes_feature(value='Private'.encode())}
label = 0
example = tf.train.Example(features=tf.train.Features(feature=feature_dict))
serialized = example.SerializeToString()
request.inputs['inputs'].CopyFrom(
tf.contrib.util.make_tensor_proto(serialized, shape=[1]))
# Create a future result, and set 5 seconds timeout
result_future = stub.Predict.future(request, 5.0)
prediction = result_future.result().outputs['scores']
print('True label: ' + str(label))
print('Prediction: ' + str(np.argmax(prediction)))
Here I wrote a simple tutorial Exporting and Serving a TensorFlow Wide & Deep Model with more details.
Hope it helps.

Related

How to use mlflow to deploy model that requires tensorflow_text for bert on local machine?

I recently use mlflow 1.29.0 to track my model training, I use BERT for text embedding which need to import tensorflow_text to register op before training, here is an example:
import tensorflow_hub as hub
import tensorflow_text as text
def create_model():
text_input = tf.keras.layers.Input(shape = (),dtype = tf.string, name = 'text_input')
preprocessed_text = preprocess_model(text_input)
encoder_text = encoder_model(preprocessed_text)['pooled_output']
text_output = tf.keras.layers.Dropout(0.1,name = 'dropout1')(encoder_text)
text_output = tf.keras.layers.Dense(units = 400, activation = tf.keras.activations.sigmoid, name = 'text_dense1')(text_output)
text_output = tf.keras.layers.Dropout(0.1,name = 'dropout2')(text_output)
final_output = tf.keras.layers.Dense(units = 1, activation = tf.keras.activations.sigmoid,name = 'output')(text_output)
model = tf.keras.Model(inputs = [text_input],outputs = [final_output])
return model
if __name__ == '__main__':
preprocess_path = 'https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3'
encoder_path = 'https://tfhub.dev/tensorflow/bert_en_uncased_L-24_H-1024_A-16/4'
preprocess_model = hub.KerasLayer(preprocess_path)
encoder_model = hub.KerasLayer(encoder_path)
with mlflow.start_run() as run:
model = create_model()
model.fit(...)
mlflow.keras.log_model(keras_model = model,...)
mlflow.end_run()
The code run successfully, and the mlflow ui showed everything, however, when I start to deploy the data on my local machine with the following command
mlflow sagemaker build-and-push-container
mlflow sagemaker run-local -m runs:/XXXXX/XXXX -p 4999
it showed the following error:
FileNotFoundError: Op type not registered 'CaseFoldUTF8' in binary running on mighty. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) tf.contrib.resampler should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
I think it's because the tensorflow_text need to be import before running the model. (based on mlflow, the conda.yaml contain tensorflow-text==2.3.0)
I've meet this error several times when I train the model, I put 'import tensorflow_text as text' at top and the problem was fixed.
However I'm not quite sure how to do that when I deploy the model locally, can anyone help me with that? thank you!
I tried other command like mlflow models serve -m runs:/XXXXX/XXXX -p 4999, and the error is still there.

Op type not registered \'IO>BigQueryClient\' with BigQuery connector on AI platform

I'm trying to parallelize the training step of my model with tensorflow ParameterServerStrategy. I work with GCP AI Platform to create the cluster and launch the task.
As my dataset is huge, I use the bigquery tensorflow connector included in tensorflow-io.
My script is inspired by the documentation of tensorflow bigquery reader and the documentation of tensorflow ParameterServerStrategy
Locally my script works well but when I launch it with AI Platform I get the following error :
{"created":"#1633444428.903993309","description":"Error received from peer ipv4:10.46.92.135:2222","file":"external/com_github_grpc_grpc/src/core/lib/surface/call.cc","file_line":1056,"grpc_message":"Op type not registered \'IO>BigQueryClient\' in binary running on gke-cml-1005-141531--n1-standard-16-2-644bc3f8-7h8p. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.","grpc_status":5}
The scripts works with fake data on AI platform and works locally with bigquery connector.
I imagine that the compilation of the model including the bigquery connector and its calls on other devices creates the bug but I don't know how to fix it.
I read this error happens when devices don't have same tensorflow versions so I checked tensorflow and tensorflow-io version on each device.
tensorflow : 2.5.0
tensorflow-io : 0.19.1
I created a similar example which reproduce the bug on AI platform
import os
from tensorflow_io.bigquery import BigQueryClient
from tensorflow_io.bigquery import BigQueryReadSession
import tensorflow as tf
import multiprocessing
import portpicker
from tensorflow.keras.layers.experimental import preprocessing
from google.cloud import bigquery
from tensorflow.python.framework import dtypes
import numpy as np
import pandas as pd
client = bigquery.Client()
PROJECT_ID = <your_project>
DATASET_ID = 'tmp'
TABLE_ID = 'bq_tf_io'
BATCH_SIZE = 32
# Bigquery requirements
def init_bq_table():
table = '%s.%s.%s' %(PROJECT_ID, DATASET_ID, TABLE_ID)
# Create toy_data
def create_toy_data(N):
x = np.random.random(size = N)
y = 0.2 + x + np.random.normal(loc=0, scale = 0.3, size = N)
return x, y
x, y =create_toy_data(1000)
df = pd.DataFrame(data = {'x': x, 'y': y})
job_config = bigquery.LoadJobConfig(write_disposition="WRITE_TRUNCATE",)
job = client.load_table_from_dataframe( df, table, job_config=job_config )
job.result()
# Create initial data
#init_bq_table()
CSV_SCHEMA = [
bigquery.SchemaField("x", "FLOAT64"),
bigquery.SchemaField("y", "FLOAT64"),
]
def transform_row(row_dict):
# Trim all string tensors
dataset_x = row_dict
dataset_x['constant'] = tf.cast(1, tf.float64)
# Extract feature column
dataset_y = dataset_x.pop('y')
#Export as tensor
dataset_x = tf.stack([dataset_x[column] for column in dataset_x], axis=-1)
return (dataset_x, dataset_y)
def read_bigquery(table_name):
tensorflow_io_bigquery_client = BigQueryClient()
read_session = tensorflow_io_bigquery_client.read_session(
"projects/" + PROJECT_ID,
PROJECT_ID, TABLE_ID, DATASET_ID,
list(field.name for field in CSV_SCHEMA),
list(dtypes.double if field.field_type == 'FLOAT64'
else dtypes.string for field in CSV_SCHEMA),
requested_streams=2)
dataset = read_session.parallel_read_rows()
return dataset
def get_data():
dataset = read_bigquery(TABLE_ID)
dataset = dataset.map(transform_row, num_parallel_calls=4)
dataset = dataset.batch(BATCH_SIZE).prefetch(2)
return dataset
cluster_resolver = tf.distribute.cluster_resolver.TFConfigClusterResolver()
# parameter server and worker just wait jobs from the coordinator (chief)
if cluster_resolver.task_type in ("worker"):
worker_config = tf.compat.v1.ConfigProto()
server = tf.distribute.Server(
cluster_resolver.cluster_spec(),
job_name=cluster_resolver.task_type,
task_index=cluster_resolver.task_id,
config=worker_config,
protocol="grpc")
server.join()
elif cluster_resolver.task_type in ("ps"):
server = tf.distribute.Server(
cluster_resolver.cluster_spec(),
job_name=cluster_resolver.task_type,
task_index=cluster_resolver.task_id,
protocol="grpc")
server.join()
elif cluster_resolver.task_type == 'chief':
strategy = tf.distribute.experimental.ParameterServerStrategy(cluster_resolver=cluster_resolver)
if cluster_resolver.task_type == 'chief':
learning_rate = 0.01
with strategy.scope():
# model
model_input = tf.keras.layers.Input(
shape=(2,), dtype=tf.float64)
layer_1 = tf.keras.layers.Dense( 8, activation='relu')(model_input)
dense_output = tf.keras.layers.Dense(1)(layer_1)
model = tf.keras.Model(model_input, dense_output)
#optimizer
optimizer=tf.keras.optimizers.SGD(learning_rate=learning_rate)
accuracy = tf.keras.metrics.MeanSquaredError()
#tf.function
def distributed_train_step(iterator):
def train_step(x_batch_train, y_batch_train):
with tf.GradientTape() as tape:
y_predict = model(x_batch_train, training=True)
loss_value = tf.keras.losses.MeanSquaredError(reduction=tf.keras.losses.Reduction.NONE)(y_batch_train, y_predict)
grads = tape.gradient(loss_value, model.trainable_weights)
optimizer.apply_gradients(zip(grads, model.trainable_weights))
accuracy.update_state(y_batch_train, y_predict)
return loss_value
x_batch_train, y_batch_train = next(iterator)
return strategy.run(train_step, args=(x_batch_train, y_batch_train))
coordinator = tf.distribute.experimental.coordinator.ClusterCoordinator(strategy)
#test
def dataset_fn(_):
def create_toy_data(N):
x = np.random.random(size = N)
y = 0.2 + x + np.random.normal(loc=0, scale = 0.3, size = N)
return np.c_[x,y]
def toy_transform_row(row):
dataset_x = tf.stack([row[0], tf.cast(1, tf.float64)], axis=-1)
dataset_y = row[1]
return dataset_x, dataset_y
N = 1000
data =create_toy_data(N)
dataset = tf.data.Dataset.from_tensor_slices(data)
dataset = dataset.map(toy_transform_row, num_parallel_calls=4)
dataset = dataset.batch(BATCH_SIZE)
dataset = dataset.prefetch(2)
return dataset
#tf.function
def per_worker_dataset_fn():
return strategy.distribute_datasets_from_function(lambda x : get_data()) # <-- Not working with AI platform
#return strategy.distribute_datasets_from_function(dataset_fn) # <-- Working with AI platform
per_worker_dataset = coordinator.create_per_worker_dataset(per_worker_dataset_fn)
# Train model
for epoch in range(5):
per_worker_iterator = iter(per_worker_dataset)
accuracy.reset_states()
for step in range(5):
coordinator.schedule(distributed_train_step, args=(per_worker_iterator,))
coordinator.join()
print ("Finished epoch %d, accuracy is %f." % (epoch, accuracy.result().numpy()))
When I create the dataset with per_worker_dataset_fn() I can use the bigquery connector (bugging) or create the dataset in live (working).
AI Platform Cluster configuration :
runtimeVersion: "2.5"
pythonVersion: "3.7"
Did someone get this issue ? Bigquery connector worked pretty well with MirroredStrategy on AI Platform. Tell me if I should report the issue somewhere else.
I think this is due to lazy loading of libtensorflow_io.so.
https://github.com/tensorflow/io/commit/85d018ee59ceccfae06914ec2a2f6d6583775ff7
Can you try adding something like this to your code:
import tensorflow_io
tensorflow_io.experimental.oss()
As far as I understand this happens because when you submit your training job to Cloud AI training, it is using a stock TensorFlow 2.5 environment that doesn't have tensorflow-io package installed. Therefore it is complaining that it doesn't know about 'IO>BigQueryClient' op defined in tensorflow-io package.
Instead you can submit your training job to be using a custom container:
https://cloud.google.com/ai-platform/training/docs/custom-containers-training
You don't need to write a new Docker file, you can use
gcr.io/deeplearning-platform-release/tf-cpu.2-5
or
gcr.io/deeplearning-platform-release/tf-gpu.2-5 (if your training job needs GPU) that has the right version of tensorflow-io installed.
You can read more about these containers here:
https://cloud.google.com/tensorflow-enterprise/docs/use-with-deep-learning-containers
Here is my old example showing how to run a distributed training on Cloud AI using BigQueryReader: https://github.com/vlasenkoalexey/criteo/blob/master/scripts/train-cloud.sh
It is no longer maintained, but should give you a general idea how it should look like.

How to import an saved Tensorflow model train using tf.estimator and predict on input data

I have save the model using tf.estimator .method export_savedmodel as follows:
export_dir="exportModel/"
feature_spec = tf.feature_column.make_parse_example_spec(feature_columns)
input_receiver_fn = tf.estimator.export.build_parsing_serving_input_receiver_fn(feature_spec)
classifier.export_savedmodel(export_dir, input_receiver_fn, as_text=False, checkpoint_path="Model/model.ckpt-400")
How can I import this saved model and use for predictions?
I tried to search for a good base example, but it appears the documentation and samples are a bit scattered for this topic. So let's start with a base example: the tf.estimator quickstart.
That particular example doesn't actually export a model, so let's do that (not need for use case 1):
def serving_input_receiver_fn():
"""Build the serving inputs."""
# The outer dimension (None) allows us to batch up inputs for
# efficiency. However, it also means that if we want a prediction
# for a single instance, we'll need to wrap it in an outer list.
inputs = {"x": tf.placeholder(shape=[None, 4], dtype=tf.float32)}
return tf.estimator.export.ServingInputReceiver(inputs, inputs)
export_dir = classifier.export_savedmodel(
export_dir_base="/path/to/model",
serving_input_receiver_fn=serving_input_receiver_fn)
Huge asterisk on this code: there appears to be a bug in TensorFlow 1.3 that doesn't allow you to do the above export on a "canned" estimator (such as DNNClassifier). For a workaround, see the "Appendix: Workaround" section.
The code below references export_dir (return value from the export step) to emphasize that it is not "/path/to/model", but rather, a subdirectory of that directory whose name is a timestamp.
Use Case 1: Perform prediction in the same process as training
This is an sci-kit learn type of experience, and is already exemplified by the sample. For completeness' sake, you simply call predict on the trained model:
classifier.train(input_fn=train_input_fn, steps=2000)
# [...snip...]
predictions = list(classifier.predict(input_fn=predict_input_fn))
predicted_classes = [p["classes"] for p in predictions]
Use Case 2: Load a SavedModel into Python/Java/C++ and perform predictions
Python Client
Perhaps the easiest thing to use if you want to do prediction in Python is SavedModelPredictor. In the Python program that will use the SavedModel, we need code like this:
from tensorflow.contrib import predictor
predict_fn = predictor.from_saved_model(export_dir)
predictions = predict_fn(
{"x": [[6.4, 3.2, 4.5, 1.5],
[5.8, 3.1, 5.0, 1.7]]})
print(predictions['scores'])
Java Client
package dummy;
import java.nio.FloatBuffer;
import java.util.Arrays;
import java.util.List;
import org.tensorflow.SavedModelBundle;
import org.tensorflow.Session;
import org.tensorflow.Tensor;
public class Client {
public static void main(String[] args) {
Session session = SavedModelBundle.load(args[0], "serve").session();
Tensor x =
Tensor.create(
new long[] {2, 4},
FloatBuffer.wrap(
new float[] {
6.4f, 3.2f, 4.5f, 1.5f,
5.8f, 3.1f, 5.0f, 1.7f
}));
// Doesn't look like Java has a good way to convert the
// input/output name ("x", "scores") to their underlying tensor,
// so we hard code them ("Placeholder:0", ...).
// You can inspect them on the command-line with saved_model_cli:
//
// $ saved_model_cli show --dir $EXPORT_DIR --tag_set serve --signature_def serving_default
final String xName = "Placeholder:0";
final String scoresName = "dnn/head/predictions/probabilities:0";
List<Tensor> outputs = session.runner()
.feed(xName, x)
.fetch(scoresName)
.run();
// Outer dimension is batch size; inner dimension is number of classes
float[][] scores = new float[2][3];
outputs.get(0).copyTo(scores);
System.out.println(Arrays.deepToString(scores));
}
}
C++ Client
You'll likely want to use tensorflow::LoadSavedModel with Session.
#include <unordered_set>
#include <utility>
#include <vector>
#include "tensorflow/cc/saved_model/loader.h"
#include "tensorflow/core/framework/tensor.h"
#include "tensorflow/core/public/session.h"
namespace tf = tensorflow;
int main(int argc, char** argv) {
const string export_dir = argv[1];
tf::SavedModelBundle bundle;
tf::Status load_status = tf::LoadSavedModel(
tf::SessionOptions(), tf::RunOptions(), export_dir, {"serve"}, &bundle);
if (!load_status.ok()) {
std::cout << "Error loading model: " << load_status << std::endl;
return -1;
}
// We should get the signature out of MetaGraphDef, but that's a bit
// involved. We'll take a shortcut like we did in the Java example.
const string x_name = "Placeholder:0";
const string scores_name = "dnn/head/predictions/probabilities:0";
auto x = tf::Tensor(tf::DT_FLOAT, tf::TensorShape({2, 4}));
auto matrix = x.matrix<float>();
matrix(0, 0) = 6.4;
matrix(0, 1) = 3.2;
matrix(0, 2) = 4.5;
matrix(0, 3) = 1.5;
matrix(0, 1) = 5.8;
matrix(0, 2) = 3.1;
matrix(0, 3) = 5.0;
matrix(0, 4) = 1.7;
std::vector<std::pair<string, tf::Tensor>> inputs = {{x_name, x}};
std::vector<tf::Tensor> outputs;
tf::Status run_status =
bundle.session->Run(inputs, {scores_name}, {}, &outputs);
if (!run_status.ok()) {
cout << "Error running session: " << run_status << std::endl;
return -1;
}
for (const auto& tensor : outputs) {
std::cout << tensor.matrix<float>() << std::endl;
}
}
Use Case 3: Serve a model using TensorFlow Serving
Exporting models in a manner amenable to serving a Classification model requires that the input be a tf.Example object. Here's how we might export a model for TensorFlow serving:
def serving_input_receiver_fn():
"""Build the serving inputs."""
# The outer dimension (None) allows us to batch up inputs for
# efficiency. However, it also means that if we want a prediction
# for a single instance, we'll need to wrap it in an outer list.
example_bytestring = tf.placeholder(
shape=[None],
dtype=tf.string,
)
features = tf.parse_example(
example_bytestring,
tf.feature_column.make_parse_example_spec(feature_columns)
)
return tf.estimator.export.ServingInputReceiver(
features, {'examples': example_bytestring})
export_dir = classifier.export_savedmodel(
export_dir_base="/path/to/model",
serving_input_receiver_fn=serving_input_receiver_fn)
The reader is referred to TensorFlow Serving's documentation for more instructions on how to setup TensorFlow Serving, so I'll only provide the client code here:
# Omitting a bunch of connection/initialization code...
# But at some point we end up with a stub whose lifecycle
# is generally longer than that of a single request.
stub = create_stub(...)
# The actual values for prediction. We have two examples in this
# case, each consisting of a single, multi-dimensional feature `x`.
# This data here is the equivalent of the map passed to the
# `predict_fn` in use case #2.
examples = [
tf.train.Example(
features=tf.train.Features(
feature={"x": tf.train.Feature(
float_list=tf.train.FloatList(value=[6.4, 3.2, 4.5, 1.5]))})),
tf.train.Example(
features=tf.train.Features(
feature={"x": tf.train.Feature(
float_list=tf.train.FloatList(value=[5.8, 3.1, 5.0, 1.7]))})),
]
# Build the RPC request.
predict_request = predict_pb2.PredictRequest()
predict_request.model_spec.name = "default"
predict_request.inputs["examples"].CopyFrom(
tensor_util.make_tensor_proto(examples, tf.float32))
# Perform the actual prediction.
stub.Predict(request, PREDICT_DEADLINE_SECS)
Note that the key, examples, that is referenced in the predict_request.inputs needs to match the key used in the serving_input_receiver_fn at export time (cf. the constructor to ServingInputReceiver in that code).
Appendix: Working around Exports from Canned Models in TF 1.3
There appears to be a bug in TensorFlow 1.3 in which canned models do not export properly for use case 2 (the problem does not exist for "custom" estimators). Here's is a workaround that wraps a DNNClassifier to make things work, specifically for the Iris example:
# Build 3 layer DNN with 10, 20, 10 units respectively.
class Wrapper(tf.estimator.Estimator):
def __init__(self, **kwargs):
dnn = tf.estimator.DNNClassifier(**kwargs)
def model_fn(mode, features, labels):
spec = dnn._call_model_fn(features, labels, mode)
export_outputs = None
if spec.export_outputs:
export_outputs = {
"serving_default": tf.estimator.export.PredictOutput(
{"scores": spec.export_outputs["serving_default"].scores,
"classes": spec.export_outputs["serving_default"].classes})}
# Replace the 3rd argument (export_outputs)
copy = list(spec)
copy[4] = export_outputs
return tf.estimator.EstimatorSpec(mode, *copy)
super(Wrapper, self).__init__(model_fn, kwargs["model_dir"], dnn.config)
classifier = Wrapper(feature_columns=feature_columns,
hidden_units=[10, 20, 10],
n_classes=3,
model_dir="/tmp/iris_model")
I dont think there is a bug with canned Estimators (or rather if there was ever one, it has been fixed). I was able to successfully export a canned estimator model using Python and import it in Java.
Here is my code to export the model:
a = tf.feature_column.numeric_column("a");
b = tf.feature_column.numeric_column("b");
feature_columns = [a, b];
model = tf.estimator.DNNClassifier(feature_columns=feature_columns ...);
# To export
feature_spec = tf.feature_column.make_parse_example_spec(feature_columns);
export_input_fn = tf.estimator.export.build_parsing_serving_input_receiver_fn(feature_spec);
servable_model_path = model.export_savedmodel(servable_model_dir, export_input_fn, as_text=True);
To import the model in Java, I used the Java client code provided by rhaertel80 above and it works. Hope this also answers Ben Fowler's question above.
It appears that the TensorFlow team does not agree that there is a bug in version 1.3 using canned estimators for exporting a model under use case #2. I submitted a bug report here:
https://github.com/tensorflow/tensorflow/issues/13477
The response I received from TensorFlow is that the input must only be a single string tensor. It appears that there may be a way to consolidate multiple features into a single string tensor using serialized TF.examples, but I have not found a clear method to do this. If anyone has code showing how to do this, I would be appreciative.
You need to export the saved model using tf.contrib.export_savedmodel and you need to define input receiver function to pass input to.
Later you can load the saved model ( generally saved.model.pb) from the disk and serve it.
TensorFlow: How to predict from a SavedModel?

How to use Tensorflow

I've built multiple DNN and conVNN using tensorflow, and I can reach now a good accuracy. Now my question is how can I use this trained networks in real example.
I case of a convNN for computer vision, how can I use the model to classify a new picture ? can I generate something like convNN.exe that get images as input parameter that through the classification result out ?
Once you've trained the model, you should save it somewhere by adding code similar to
builder = saved_model_builder.SavedModelBuilder(export_path)
builder.add_meta_graph_and_variables(
sess, [tag_constants.SERVING],
signature_def_map={
'predict_images':
prediction_signature,
signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
classification_signature,
},
legacy_init_op=legacy_init_op)
builder.save()
Then, you can use Tensorflow serving to serve your model using a high-performance C++ server by running
bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server \
--port=9000 --model_name=mnist \
--model_base_path=/tmp/mnist_model/
Modifying the code for your model, of course. You'll need to implement a client; there's an example for MNIST here. The guts of the client would be something like:
def do_inference(hostport, work_dir, concurrency, num_tests):
"""Tests PredictionService with concurrent requests.
Args:
hostport: Host:port address of the PredictionService.
work_dir: The full path of working directory for test data set.
concurrency: Maximum number of concurrent requests.
num_tests: Number of test images to use.
Returns:
The classification error rate.
Raises:
IOError: An error occurred processing test data set.
"""
test_data_set = mnist_input_data.read_data_sets(work_dir).test
host, port = hostport.split(':')
channel = implementations.insecure_channel(host, int(port))
stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)
result_counter = _ResultCounter(num_tests, concurrency)
for _ in range(num_tests):
request = predict_pb2.PredictRequest()
request.model_spec.name = 'mnist'
request.model_spec.signature_name = 'predict_images'
image, label = test_data_set.next_batch(1)
request.inputs['images'].CopyFrom(
tf.contrib.util.make_tensor_proto(image[0], shape=[1, image[0].size]))
result_counter.throttle()
result_future = stub.Predict.future(request, 5.0) # 5 seconds
result_future.add_done_callback(
_create_rpc_callback(label[0], result_counter))
return result_counter.get_error_rate()
def main(_):
if FLAGS.num_tests > 10000:
print('num_tests should not be greater than 10k')
return
if not FLAGS.server:
print('please specify server host:port')
return
error_rate = do_inference(FLAGS.server, FLAGS.work_dir,
FLAGS.concurrency, FLAGS.num_tests)
print('\nInference error rate: %s%%' % (error_rate * 100))
if __name__ == '__main__':
tf.app.run()
This is in Python, of course, but there's no reason you couldn't use another language (e.g. Go or C++) if you wanted to create a binary executable.

"Output 0 of type double does not match declared output type string" while running the iris sample program in TensorFlow Serving

I am running the sample iris program in TensorFlow Serving. Since it is a TF.Learn model, I am exporting the model using the following classifier.export(export_dir=model_dir,signature_fn=my_classification_signature_fn) and the signature_fn is defined as shown below:
def my_classification_signature_fn(examples, unused_features, predictions):
"""Creates classification signature from given examples and predictions.
Args:
examples: `Tensor`.
unused_features: `dict` of `Tensor`s.
predictions: `Tensor` or dict of tensors that contains the classes tensor
as in {'classes': `Tensor`}.
Returns:
Tuple of default classification signature and empty named signatures.
Raises:
ValueError: If examples is `None`.
"""
if examples is None:
raise ValueError('examples cannot be None when using this signature fn.')
if isinstance(predictions, dict):
default_signature = exporter.classification_signature(
examples, classes_tensor=predictions['classes'])
else:
default_signature = exporter.classification_signature(
examples, classes_tensor=predictions)
named_graph_signatures={
'inputs': exporter.generic_signature({'x_values': examples}),
'outputs': exporter.generic_signature({'preds': predictions})}
return default_signature, named_graph_signatures
The model gets successfully exported using the following piece of code.
I have created a client which makes real-time predictions using TensorFlow Serving.
The following is the code for the client:
flags.DEFINE_string("model_dir", "/tmp/iris_model_dir", "Base directory for output models.")
tf.app.flags.DEFINE_integer('concurrency', 1,
'maximum number of concurrent inference requests')
tf.app.flags.DEFINE_string('server', '', 'PredictionService host:port')
#connection
host, port = FLAGS.server.split(':')
channel = implementations.insecure_channel(host, int(port))
stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)
# Classify two new flower samples.
new_samples = np.array([5.8, 3.1, 5.0, 1.7], dtype=float)
request = predict_pb2.PredictRequest()
request.model_spec.name = 'iris'
request.inputs["x_values"].CopyFrom(
tf.contrib.util.make_tensor_proto(new_samples))
result = stub.Predict(request, 10.0) # 10 secs timeout
However, on making the predictions, the following error is displayed:
grpc.framework.interfaces.face.face.AbortionError: AbortionError(code=StatusCode.INTERNAL, details="Output 0 of type double does not match declared output type string for node _recv_input_example_tensor_0 = _Recv[client_terminated=true, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=2016246895612781641, tensor_name="input_example_tensor:0", tensor_type=DT_STRING, _device="/job:localhost/replica:0/task:0/cpu:0"]()")
Here is the entire stack trace.
enter image description here
The iris model is defined in the following manner:
# Specify that all features have real-value data
feature_columns = [tf.contrib.layers.real_valued_column("", dimension=4)]
# Build 3 layer DNN with 10, 20, 10 units respectively.
classifier = tf.contrib.learn.DNNClassifier(feature_columns=feature_columns,
hidden_units=[10, 20, 10],
n_classes=3, model_dir=model_dir)
# Fit model.
classifier.fit(x=training_set.data,
y=training_set.target,
steps=2000)
Kindly guide a solution for this error.
I think the problem is that your signature_fn is going on the else branch and passing predictions as the output to the classification signature, which expects a string output and not a double output. Either use a regression signature function or add something to the graph to get the output in the form of a string.