I have a SageMaker TensorFlow model using a custom estimator, similar to the abalone.py SageMaker TensorFlow example, with build_raw_serving_input_receiver_fn in the serving_input_fn:
def serving_input_fn(params):
    tensor = tf.placeholder(tf.float32, shape=[1, NUM_FEATURES])
    return build_raw_serving_input_receiver_fn({INPUT_TENSOR_NAME: tensor})()
Predictions are being requested from JavaScript using JSON:
response = @client.invoke_endpoint(
  endpoint_name: @name,
  content_type: "application/json",
  accept: "application/json",
  body: values.to_json
)
Everything is fine so far. Now I want to add some feature engineering (scaling transformations on the features, using a scaler derived from the training data). Following the pattern of the answer for Data Normalization with tensorflow tf-transform,
I've now got serving_input_fn like this:
def serving_input_fn(params):
    feature_placeholders = {
        'f1': tf.placeholder(tf.float32, [None]),
        'f2': tf.placeholder(tf.float32, [None]),
        'f3': tf.placeholder(tf.float32, [None]),
    }
    features = {
        key: tf.expand_dims(tensor, -1)
        for key, tensor in feature_placeholders.items()
    }
    return tf.estimator.export.ServingInputReceiver(add_engineering(features), feature_placeholders)
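(add_engineering is my own helper; a hypothetical version doing the kind of scaling I mean might look like this, with the numbers standing in for statistics computed from the training data.)

# Hypothetical add_engineering: standardise each feature with mean/std taken
# from the training data (the numbers here are just placeholders).
TRAINING_STATS = {
    'f1': (0.5, 0.1),  # (mean, std)
    'f2': (1.2, 0.4),
    'f3': (3.0, 2.5),
}

def add_engineering(features):
    scaled = {}
    for key, tensor in features.items():
        mean, std = TRAINING_STATS[key]
        scaled[key] = (tensor - mean) / std
    return scaled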
From saved_model_cli show --dir . --all I can see the input signature has changed:
signature_def['serving_default']:
The given SavedModel SignatureDef contains the following input(s):
inputs['f1'] tensor_info:
dtype: DT_FLOAT
shape: (-1)
name: Placeholder_1:0
inputs['f2'] tensor_info:
dtype: DT_FLOAT
shape: (-1)
name: Placeholder_2:0
inputs['f3'] tensor_info:
dtype: DT_FLOAT
shape: (-1)
name: Placeholder:0
How do I prepare features for prediction from this new model? In Python I've been unsuccessfully trying things like
requests = [{'f1':[0.1], 'f2':[0.1], 'f3':[0.2]}]
predictor.predict(requests)
I also need to send prediction requests from JavaScript.
You can define an
def input_fn(data=None, content_type=None):
This is called directly when a call is made to SageMaker, so you can do your feature preparation in this function; model_fn is called after it.
Make sure you return a dict of string to TensorProto, i.e. dict{"input tensor name": TensorProto}, from the input_fn method.
You can find more details here:
https://docs.aws.amazon.com/sagemaker/latest/dg/tf-training-inference-code-template.html
A sample input_fn would look something like this:
def input_fn(data=None, content_type=None):
    """
    Args:
        data: An Amazon SageMaker InvokeEndpoint request body
        content_type: An Amazon SageMaker InvokeEndpoint ContentType value for data.
    Returns:
        object: A deserialized object that will be used by TensorFlow serving as input.
    """
    # `inputs` is based on the parameters defined in the model spec's signature_def
    return {"inputs": tf.make_tensor_proto(data, shape=(1,))}
I have managed to make the feature values available on their way into prediction via a SageMaker input_fn definition, as suggested by Raman. It means going back to the build_raw_serving_input_receiver_fn version of serving_input_fn I started with (top of post). The input_fn looks like this:
def input_fn(data=None, content_type=None):
    if content_type == 'application/json':
        values = np.asarray(json.loads(data))
        return {"inputs": tf.make_tensor_proto(values=values, shape=values.shape, dtype=tf.float32)}
    else:
        return {"inputs": data}
Although I can't pass e.g. a scaler from training into this procedure, it will probably work to embed it in the model.py file that SageMaker requires (which contains this input_fn definition); a rough sketch of that idea follows the request examples below. What I have now responds correctly when addressed from Python, either by
data = [[0.1, 0.2, 0.3]]
payload = json.dumps(data)
response = client.invoke_endpoint(
    EndpointName=endpoint_name,
    Body=payload,
    ContentType='application/json'
)
result = json.loads(response['Body'].read().decode())
or
values = np.asarray([[0.1, 0.2, 0.3]])
prediction = predictor.predict(values)
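As for embedding the scaler: a rough sketch of what model.py could contain, with FEATURE_MEANS/FEATURE_STDS as placeholders for statistics computed from the training data (they could equally be loaded from a file shipped with the model):

import json
import numpy as np
import tensorflow as tf

# Placeholder statistics; in practice these come from the scaler fitted on
# the training data.
FEATURE_MEANS = np.array([0.0, 0.0, 0.0], dtype=np.float32)
FEATURE_STDS = np.array([1.0, 1.0, 1.0], dtype=np.float32)

def input_fn(data=None, content_type=None):
    if content_type == 'application/json':
        values = np.asarray(json.loads(data), dtype=np.float32)
        # Apply the same scaling that was used during training.
        values = (values - FEATURE_MEANS) / FEATURE_STDS
        return {"inputs": tf.make_tensor_proto(values=values, shape=values.shape,
                                               dtype=tf.float32)}
    else:
        return {"inputs": data}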
This is all new to me... please recommend improvements/alert me to potential problems if you know of any.
Related
I have an ML model developed using Keras, or more accurately, using the Functional API. Once I save the model and use the saved_model_cli tool on it:
$ saved_model_cli show --dir /serving_model_folder/1673549934 --tag_set serve --signature_def serving_default
2023-01-12 10:59:50.836255: I tensorflow/core/util/util.cc:169] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
The given SavedModel SignatureDef contains the following input(s):
inputs['f1'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 1)
name: serving_default_f1:0
inputs['f2'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 1)
name: serving_default_f2:0
inputs['f3'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 1)
name: serving_default_f3:0
inputs['f4'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 1)
name: serving_default_f4:0
The given SavedModel SignatureDef contains the following output(s):
outputs['output_0'] tensor_info:
dtype: DT_FLOAT
shape: (-1)
name: StatefulPartitionedCall_1:0
outputs['output_1'] tensor_info:
dtype: DT_FLOAT
shape: (-1)
name: StatefulPartitionedCall_1:1
outputs['output_2'] tensor_info:
dtype: DT_FLOAT
shape: (-1)
name: StatefulPartitionedCall_1:2
Method name is: tensorflow/serving/predict
As you can see, the 3 output attributes are named: output_0, output_1, and output_2. This is how I'm instantiating my model:
input_layers = {
    'f1': Input(shape=(1,), name='f1'),
    'f2': Input(shape=(1,), name='f2'),
    'f3': Input(shape=(1,), name='f3'),
    'f4': Input(shape=(1,), name='f4'),
}
x = layers.concatenate(input_layers.values())
x = layers.Dense(32, activation='relu', name="dense")(x)
output_layers = {
    't1': layers.Dense(1, activation='sigmoid', name='t1')(x),
    't2': layers.Dense(1, activation='sigmoid', name='t2')(x),
    't3': layers.Dense(1, activation='sigmoid', name='t3')(x),
}
model = models.Model(input_layers, output_layers)
I was hoping that the saved model would name the output attributes t1, t2, and t3. Searching online, I see that I can rename them if I subclass tf.Module:
class CustomModuleWithOutputName(tf.Module):
    def __init__(self):
        super(CustomModuleWithOutputName, self).__init__()
        self.v = tf.Variable(1.)

    @tf.function(input_signature=[tf.TensorSpec([], tf.float32)])
    def __call__(self, x):
        return {'custom_output_name': x * self.v}

module_output = CustomModuleWithOutputName()
call_output = module_output.__call__.get_concrete_function(tf.TensorSpec(None, tf.float32))
module_output_path = os.path.join(tmpdir, 'module_with_output_name')
tf.saved_model.save(module_output, module_output_path,
                    signatures={'serving_default': call_output})
But I would like to keep using the Functional API. Is there any way to specify the name of the output attributes while using Keras Functional API?
I managed to pull this off a different way. It relies on the signature and adds a new layer just to rename the tensors.
from tensorflow.keras import layers

class CustomModuleWithOutputName(layers.Layer):
    def __init__(self):
        super(CustomModuleWithOutputName, self).__init__()

    def call(self, x):
        return {'t1': tf.identity(x[0]),
                't2': tf.identity(x[1]),
                't3': tf.identity(x[2])}

def _get_tf_examples_serving_signature(model):
    @tf.function(input_signature=[tf.TensorSpec(shape=[None, 1], dtype=tf.float32, name='f1'),
                                  tf.TensorSpec(shape=[None, 1], dtype=tf.float32, name='f2'),
                                  tf.TensorSpec(shape=[None, 1], dtype=tf.float32, name='f3'),
                                  tf.TensorSpec(shape=[None, 1], dtype=tf.float32, name='f4')])
    def serve_tf_examples_fn(f1, f2, f3, f4):
        """Returns the output to be used in the serving signature."""
        inputs = {'f1': f1, 'f2': f2, 'f3': f3, 'f4': f4}
        outputs = model(inputs)
        return model.naming_layer(outputs)
    return serve_tf_examples_fn

# This is the same model mentioned in the question (a Functional API model)
model = get_model()

# Any property name will do as long as it is not reserved
model.naming_layer = CustomModuleWithOutputName()

signatures = {
    'serving_default': _get_tf_examples_serving_signature(model),
}

model.save(output_dir, save_format='tf', signatures=signatures)
The takeaway from this code is the CustomModuleWithOutputName class. It's a subclass of Keras's Layer and all it does is give names to the output indices. This layer is added to the model's graph in the serving_default signature before the model is saved. It's a somewhat clumsy solution, but it works. It also relies on the order of the tensors returned by the original Functional API model.
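A quick sanity check (a sketch, assuming the output_dir used above) is to reload the SavedModel and inspect the signature's outputs:

import tensorflow as tf

# Reload the export and confirm the serving signature now exposes the
# renamed outputs ('t1', 't2', 't3') instead of output_0/1/2.
reloaded = tf.saved_model.load(output_dir)
serving_fn = reloaded.signatures['serving_default']
print(serving_fn.structured_outputs)  # expect keys 't1', 't2', 't3'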
I was hoping my original approach would work. But since it doesn't, at least I have this one to fit the bill.
I have saved a pre-trained version of DistilBERT, distilbert-base-uncased-finetuned-sst-2-english, from Hugging Face models, and I am attempting to serve it via TensorFlow Serving and make predictions. Everything is currently being tested in Colab.
I am having issues getting the prediction into the correct format for the model via TensorFlow Serving. TensorFlow Serving is up and running fine serving the model; however, my prediction code is not correct and I need some help understanding how to make a prediction via JSON over the API.
# tokenize and encode a simple positive instance
instances = tokenizer.tokenize('this is the best day of my life!')
instances = tokenizer.encode(instances)
data = json.dumps({"signature_name": "serving_default", "instances": instances, })
print(data)
{"signature_name": "serving_default", "instances": [101, 2023, 2003, 1996, 2190, 2154, 1997, 2026, 2166, 999, 102]}
# setup json_response object
headers = {"content-type": "application/json"}
json_response = requests.post('http://localhost:8501/v1/models/my_model:predict', data=data, headers=headers)
predictions = json.loads(json_response.text)
predictions
{'error': '{{function_node __inference__wrapped_model_52602}} {{function_node __inference__wrapped_model_52602}} Incompatible shapes: [11,768] vs. [1,5,768]\n\t [[{{node tf_distil_bert_for_sequence_classification_3/distilbert/embeddings/add}}]]\n\t [[StatefulPartitionedCall/StatefulPartitionedCall]]'}
Any direction here would be appreciated.
I was able to find the solution by setting signatures for the input shape and attention mask, as shown below. This is a simple implementation that uses a fixed input shape for a saved model and requires you to pad the inputs to the expected input length of 384. I have seen implementations that call custom signatures and model creation to match expected input shapes; however, the simple case below worked for what I was looking to accomplish with serving a Hugging Face model via TF Serving. If anyone has better examples or ways to extend this functionality, please post them for future use.
# create callable
from transformers import TFDistilBertForQuestionAnswering
distilbert = TFDistilBertForQuestionAnswering.from_pretrained('distilbert-base-cased-distilled-squad')
callable = tf.function(distilbert.call)
By calling get_concrete_function, we trace-compile the TensorFlow operations of the model for an input signature composed of two Tensors of shape [None, 384], the first one being the input ids and the second one the attention mask.
concrete_function = callable.get_concrete_function([tf.TensorSpec([None, 384], tf.int32, name="input_ids"), tf.TensorSpec([None, 384], tf.int32, name="attention_mask")])
save the model with the signatures:
# stored model path for TF Serve (1 = version 1) --> '/path/to/my/model/distilbert_qa/1/'
distilbert_qa_save_path = 'path_to_model'
tf.saved_model.save(distilbert, distilbert_qa_save_path, signatures=concrete_function)
check to see that it contains the correct signature:
saved_model_cli show --dir 'path_to_model' --tag_set serve --signature_def serving_default
output should look like:
The given SavedModel SignatureDef contains the following input(s):
inputs['attention_mask'] tensor_info:
dtype: DT_INT32
shape: (-1, 384)
name: serving_default_attention_mask:0
inputs['input_ids'] tensor_info:
dtype: DT_INT32
shape: (-1, 384)
name: serving_default_input_ids:0
The given SavedModel SignatureDef contains the following output(s):
outputs['output_0'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 384)
name: StatefulPartitionedCall:0
outputs['output_1'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 384)
name: StatefulPartitionedCall:1
Method name is: tensorflow/serving/predict
TEST MODEL:
from transformers import DistilBertTokenizer
tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-cased')
question, text = "Who was Benjamin?", "Benjamin was a silly dog."
input_dict = tokenizer(question, text, return_tensors='tf')
start_scores, end_scores = distilbert(input_dict)
all_tokens = tokenizer.convert_ids_to_tokens(input_dict["input_ids"].numpy()[0])
answer = ' '.join(all_tokens[tf.math.argmax(start_scores, 1)[0] : tf.math.argmax(end_scores, 1)[0]+1])
FOR TF SERVE (in Colab), which was my original intent with this:
!echo "deb http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal" | tee /etc/apt/sources.list.d/tensorflow-serving.list && \
curl https://storage.googleapis.com/tensorflow-serving-apt/tensorflow-serving.release.pub.gpg | apt-key add -
!apt update
!apt-get install tensorflow-model-server
import os
# path_to_model --> versions directory --> '/path/to/my/model/distilbert_qa/'
# actual stored model path version 1 --> '/path/to/my/model/distilbert_qa/1/'
MODEL_DIR = 'path_to_model'
os.environ["MODEL_DIR"] = os.path.abspath(MODEL_DIR)
%%bash --bg
nohup tensorflow_model_server --rest_api_port=8501 --model_name=my_model --model_base_path="${MODEL_DIR}" >server.log 2>&1
!tail server.log
MAKE A POST REQUEST:
import json
!pip install -q requests
import requests
import numpy as np
max_length = 384 # must equal model signature expected input value
question, text = "Who was Benjamin?", "Benjamin was a good boy."
# padding='max_length' pads the input to the expected input length (else incompatible shapes error)
input_dict = tokenizer(question, text, return_tensors='tf', padding='max_length', max_length=max_length)
input_ids = input_dict["input_ids"].numpy().tolist()[0]
att_mask = input_dict["attention_mask"].numpy().tolist()[0]
features = [{'input_ids': input_ids, 'attention_mask': att_mask}]
data = json.dumps({ "signature_name": "serving_default", "instances": features})
headers = {"content-type": "application/json"}
json_response = requests.post('http://localhost:8501/v1/models/my_model:predict', data=data, headers=headers)
print(json_response)
predictions = json.loads(json_response.text)['predictions']
all_tokens = tokenizer.convert_ids_to_tokens(input_dict["input_ids"].numpy()[0])
answer = ' '.join(all_tokens[tf.math.argmax(predictions[0]['output_0']) : tf.math.argmax(predictions[0]['output_1'])+1])
print(answer)
I saved a Keras .h5 model to .pb using SavedModelBuilder. After I use the docker image tensorflow/serving:1.14.0 to deploy my model, when I run the prediction process I get "requests.exceptions.HTTPError: 501 Server Error: Not Implemented for url: http://localhost:8501/v1/models/genre:predict".
The model-building code is as follows:
from keras import backend as K
import tensorflow as tf
from keras.models import load_model

model = load_model('/home/li/model.h5')

model_signature = tf.saved_model.signature_def_utils.predict_signature_def(
    inputs={'input': model.input}, outputs={'output': model.output})

# export_path = os.path.join(model_path, model_version)
export_path = "/home/li/genre/1"
builder = tf.saved_model.builder.SavedModelBuilder(export_path)
builder.add_meta_graph_and_variables(
    sess=K.get_session(),
    tags=[tf.saved_model.tag_constants.SERVING],
    signature_def_map={
        'predict': model_signature,
        'serving_default': model_signature
    })
builder.save()
Then I got the .pb model:
When I run saved_model_cli show --dir /home/li/genre/1 --all, the saved .pb model information is as follows:
MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:
signature_def['predict']:
The given SavedModel SignatureDef contains the following input(s):
inputs['input'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 1, 128, 1292)
name: conv2d_1_input_2:0
The given SavedModel SignatureDef contains the following output(s):
outputs['output'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 19)
name: dense_2_2/Softmax:0
Method name is: tensorflow/serving/predict
signature_def['serving_default']:
The given SavedModel SignatureDef contains the following input(s):
inputs['input'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 1, 128, 1292)
name: conv2d_1_input_2:0
The given SavedModel SignatureDef contains the following output(s):
outputs['output'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 19)
name: dense_2_2/Softmax:0
Method name is: tensorflow/serving/predict
The command I use to deploy on the tensorflow/serving docker image is
docker run -p 8501:8501 --name tfserving_genre --mount type=bind,source=/home/li/genre,target=/models/genre -e MODEL_NAME=genre -t tensorflow/serving:1.14.0 &
When I open http://localhost:8501/v1/models/genre in a browser, I get the message
{
  "model_version_status": [
    {
      "version": "1",
      "state": "AVAILABLE",
      "status": {
        "error_code": "OK",
        "error_message": ""
      }
    }
  ]
}
The client prediction code is as follows:
import requests
import numpy as np
import os
import sys
from audio_to_spectrum_v2 import split_song_to_frames
# Define a Base client class for Tensorflow Serving
class TFServingClient:
    """
    This is a base class that implements a Tensorflow Serving client
    """
    TF_SERVING_URL_FORMAT = '{protocol}://{hostname}:{port}/v1/models/{endpoint}:predict'

    def __init__(self, hostname, port, endpoint, protocol="http"):
        self.protocol = protocol
        self.hostname = hostname
        self.port = port
        self.endpoint = endpoint

    def _query_service(self, req_json):
        """
        :param req_json: dict (as defined in https://cloud.google.com/ml-engine/docs/v1/predict-request)
        :return: dict
        """
        server_url = self.TF_SERVING_URL_FORMAT.format(protocol=self.protocol,
                                                       hostname=self.hostname,
                                                       port=self.port,
                                                       endpoint=self.endpoint)
        response = requests.post(server_url, json=req_json)
        response.raise_for_status()
        print(response.json())
        return np.array(response.json()['output'])


# Define a specific client for our inception_v3 model
class GenreClient(TFServingClient):
    # INPUT_NAME is the config value we used when saving the model (the only value in the `input_names` list)
    INPUT_NAME = "input"

    def load_song(self, song_path):
        """Load a song from path, slice it into pieces, and extract features, returned in np.array format"""
        song_pieces = split_song_to_frames(song_path, False, 30)
        return song_pieces

    def predict(self, song_path):
        song_pieces = self.load_song(song_path)
        # Create a request json dict
        req_json = {
            "instances": song_pieces.tolist()
        }
        print(req_json)
        return self._query_service(req_json)


def main():
    song_path = sys.argv[1]
    print("file name:{}".format(os.path.split(song_path)[-1]))
    hostname = "localhost"
    port = "8501"
    endpoint = "genre"
    client = GenreClient(hostname=hostname, port=port, endpoint=endpoint)
    prediction = client.predict(song_path)
    print(prediction)


if __name__ == '__main__':
    main()
After running the prediction code, I get the following error:
Traceback (most recent call last):
File "client_predict.py", line 90, in <module>
main()
File "client_predict.py", line 81, in main
prediction = client.predict(song_path)
File "client_predict.py", line 69, in predict
return self._query_service(req_json)
File "client_predict.py", line 40, in _query_service
response.raise_for_status()
File "/home/li/anaconda3/lib/python3.7/site-packages/requests/models.py", line 940, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 501 Server Error: Not Implemented for url: http://localhost:8501/v1/models/genre:predict
I wonder what the reason for this deployment problem is, and how to solve it. Thanks, all.
I've tried to print the response using
pred = json.loads(r.content.decode('utf-8'))
print(pred)
The problem is caused by "conv implementation only supports NHWC tensor format for now."
In the end, I changed the data format from NCHW to NHWC in Conv2D.
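In Keras terms the change looks roughly like this (a sketch; the filter count and kernel size are illustrative, not the actual genre model):

from keras.layers import Conv2D

# Before: NCHW / channels_first, input shaped (batch, 1, 128, 1292); the CPU conv
# kernels in the serving image reject this layout ("only supports NHWC" above).
# conv = Conv2D(32, (3, 3), data_format='channels_first', input_shape=(1, 128, 1292))

# After: NHWC / channels_last, channel axis moved to the end: (batch, 128, 1292, 1).
conv = Conv2D(32, (3, 3), data_format='channels_last', input_shape=(128, 1292, 1))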
While using the following code and doing a gcloud ml-engine local predict I get:
InvalidArgumentError (see above for traceback): You must feed a value
for placeholder tensor 'Placeholder' with dtype string and shape [?]
[[Node: Placeholder = Placeholder[dtype=DT_STRING, shape=[?], _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]] (Error code: 2)
tf_files_path = './tf'
# os.makedirs(tf_files_path)  # temp dir
estimator = tf.keras.estimator.model_to_estimator(
    keras_model_path="model_data/yolo.h5",
    model_dir=tf_files_path)
# up_one_dir(os.path.join(tf_files_path, 'keras'))

def serving_input_receiver_fn():
    def prepare_image(image_str_tensor):
        image = tf.image.decode_jpeg(image_str_tensor,
                                     channels=3)
        image = tf.divide(image, 255)
        image = tf.image.convert_image_dtype(image, tf.float32)
        return image

    # Ensure model is batchable
    # https://stackoverflow.com/questions/52303403/
    input_ph = tf.placeholder(tf.string, shape=[None])
    images_tensor = tf.map_fn(
        prepare_image, input_ph, back_prop=False, dtype=tf.float32)
    return tf.estimator.export.ServingInputReceiver(
        {model.input_names[0]: images_tensor},
        {'image_bytes': input_ph})

export_path = './export'
estimator.export_savedmodel(
    export_path,
    serving_input_receiver_fn=serving_input_receiver_fn)
The JSON I am sending to ML Engine looks like this:
{"image_bytes": {"b64": "/9j/4AAQSkZJRgABAQAAAQABAAD/2w..."}}
When not doing a local prediction, but sending it to ML engine itself, I get:
ERROR: (gcloud.ml-engine.predict) HTTP request failed. Response: {
"error": {
"code": 500,
"message": "Internal error encountered.",
"status": "INTERNAL"
}
}
The saved_model_cli gives:
saved_model_cli show --all --dir export/1547848897/
MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:
signature_def['serving_default']:
The given SavedModel SignatureDef contains the following input(s):
inputs['image_bytes'] tensor_info:
dtype: DT_STRING
shape: (-1)
name: Placeholder:0
The given SavedModel SignatureDef contains the following output(s):
outputs['conv2d_59'] tensor_info:
dtype: DT_FLOAT
shape: (-1, -1, -1, 255)
name: conv2d_59/BiasAdd:0
outputs['conv2d_67'] tensor_info:
dtype: DT_FLOAT
shape: (-1, -1, -1, 255)
name: conv2d_67/BiasAdd:0
outputs['conv2d_75'] tensor_info:
dtype: DT_FLOAT
shape: (-1, -1, -1, 255)
name: conv2d_75/BiasAdd:0
Method name is: tensorflow/serving/predict
Does anyone see what is going wrong here?
The issue has been resolved. The output of the model appeared to be too big for ML Engine to send back, and this was not surfaced in anything more relevant than a 500 internal error. We added some post-processing steps in the model and it works fine now.
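The exact post-processing we added isn't shown here, but the idea is simply to make the exported outputs much smaller before export. A purely illustrative sketch (the pooling layer is a stand-in, not the real post-processing):

import tensorflow as tf

# Load the Keras model, shrink each raw output head so the prediction
# response stays small, then convert to an estimator as before.
keras_model = tf.keras.models.load_model("model_data/yolo.h5")
reduced = [tf.keras.layers.GlobalMaxPooling2D()(o) for o in keras_model.outputs]
small_model = tf.keras.Model(inputs=keras_model.inputs, outputs=reduced)

estimator = tf.keras.estimator.model_to_estimator(keras_model=small_model,
                                                  model_dir=tf_files_path)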
As for the gcloud ml-engine local predict command returning an error, it seems to be a bug: the model works on ML Engine now, but local prediction still returns this error.
I'm currently working with the TensorFlow Estimator API and am having problems with the confusing serving options that are available. My confusion comes from the very sparse TensorFlow documentation.
This is my goal:
Use tensorflow-serving's prediction_service_pb2 by sending a serialized proto message as a string to the ServingInputReceiver function of my exported Estimator model. I expect the ServingInputReceiver function to receive the serialized proto string on the "input" tensor and then deserialize it into the features "ink" (a variable-length float array) and "shape" (a fixed-length int64).
This is my Estimator input function (an implementation of the Google quickdraw model):
def _parse_tfexample_fn(example_proto, mode):
    """Parse a single record which is expected to be a tensorflow.Example."""
    feature_to_type = {
        "ink": tf.VarLenFeature(dtype=tf.float32),
        "shape": tf.FixedLenFeature([2], dtype=tf.int64)
    }
    if mode != tf.estimator.ModeKeys.PREDICT:
        # The labels won't be available at inference time, so don't add them
        # to the list of feature_columns to be read.
        feature_to_type["class_index"] = tf.FixedLenFeature([1], dtype=tf.int64)
    parsed_features = tf.parse_single_example(example_proto, feature_to_type)
    parsed_features["ink"] = tf.sparse_tensor_to_dense(parsed_features["ink"])
    if mode != tf.estimator.ModeKeys.PREDICT:
        labels = parsed_features["class_index"]
        return parsed_features, labels
    else:
        return parsed_features  # In prediction, we have no labels
This is my Serving Input Function:
def serving_input_receiver_fn():
    """An input receiver that expects a serialized tf.Example."""
    feature_to_type = {"ink": tf.VarLenFeature(dtype=tf.float32), "shape": tf.FixedLenFeature([2], dtype=tf.int64)}
    serialized_tf_example = tf.placeholder(dtype=tf.string, shape=[None], name='input')
    parsed_features = tf.parse_example(serialized_tf_example, feature_to_type)
    parsed_features["ink"] = tf.sparse_tensor_to_dense(parsed_features["ink"])
    return tf.estimator.export.ServingInputReceiver(parsed_features, serialized_tf_example)
This is my client.py request:
features = {}
features["ink"] = tf.train.Feature(float_list=tf.train.FloatList(value=np_ink.flatten()))
features["shape"] = tf.train.Feature(int64_list=tf.train.Int64List(value=np_ink.shape))
f = tf.train.Features(feature=features)
data = tf.train.Example(features=f)
serialized=data.SerializeToString() # tensor to byte string
request.inputs['input'].ParseFromString(tf.contrib.util.make_tensor_proto(serialized, shape=[1], verify_shape=True))
And this is the error I get after calling the Predict function in client.py
grpc.framework.interfaces.face.face.AbortionError: AbortionError(code=StatusCode.INVALID_ARGUMENT, details="input tensor alias not found in signature: ink. Inputs expected to be in the set {input}.")
I tried the following serving functions:
ServingInputReceiver and build_raw_serving_input_receiver_fn give me the same gRPC error. When I use build_parsing_serving_input_receiver_fn it won't even export my model. I've tried to wrap my head around the documentation, but it is very sparse and I don't understand when to use which serving input function.
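For reference, this is a sketch of what I understand the canned helper to be roughly equivalent to; it builds the same "string placeholder -> tf.parse_example" receiver as my hand-written serving_input_receiver_fn above, except that it leaves "ink" as a SparseTensor (no sparse_tensor_to_dense step):

# Sketch: the parsing helper wraps the placeholder + tf.parse_example pattern.
feature_to_type = {
    "ink": tf.VarLenFeature(dtype=tf.float32),
    "shape": tf.FixedLenFeature([2], dtype=tf.int64),
}
serving_input_receiver_fn = tf.estimator.export.build_parsing_serving_input_receiver_fn(
    feature_to_type)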