TensorFlow Serving input payload

I have a model saved in SavedModel format (.pb). The model is served without problems, and I am trying to make a prediction via TensorFlow Serving. TF Serving requires me to pass the data as a list; otherwise I get TypeError: Object of type 'ndarray' is not JSON serializable. But when I pass a list, the response is an error.
The input is
value = [1, 2, 3, 4, 5]
body = {"signature_name": "serving_default",
        "instances": [[value]]}
res = requests.post(url=url, data=json.dumps(body))
and the response is { "error": "In[0] is not a matrix. Instead it has shape [1,1,5]\n\t [[{{node sequential/dense/Relu}}]]" }
I know the model works; without TensorFlow Serving the input is
value = np.array([1,2,3,4,5])
model.predict([[value]])
So the problem is: how can I use TensorFlow Serving if it requires a list as input but the model requires an np.array as input?

I suppose you should do it this way: convert the array to a nested list with tolist() before building the body.
value = <ndarray>
data = value.tolist()
body = {
    "signature_name": "serving_default",
    "instances": data}

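The error in the question reports shape [1,1,5], which suggests one level of nesting too many: each entry of "instances" is one input row, so a single five-feature sample would be sent as [[1, 2, 3, 4, 5]] rather than [[[1, 2, 3, 4, 5]]]. Below is a minimal sketch of building such a body with the standard library only. The helper names build_predict_body and nested_shape are made up for illustration, and the sketch assumes the model's first dense layer takes rows of five features. If you start from an ndarray, call tolist() on it first.

```python
import json


def nested_shape(data):
    """Return the shape of a regular nested list, e.g. [[1, 2, 3]] -> (1, 3)."""
    shape = []
    while isinstance(data, list):
        shape.append(len(data))
        data = data[0] if data else None
    return tuple(shape)


def build_predict_body(rows, signature_name="serving_default"):
    """Build a row-format predict request body (hypothetical helper).

    `rows` must be a list of samples, i.e. a 2-D nested list such as
    [[1, 2, 3, 4, 5]]; pass `array.tolist()` if you start from an ndarray.
    """
    shape = nested_shape(rows)
    if len(shape) != 2:
        raise ValueError("expected a 2-D list, got shape %s" % (shape,))
    return json.dumps({"signature_name": signature_name, "instances": rows})


value = [1, 2, 3, 4, 5]
body = build_predict_body([value])  # one sample, shape (1, 5)
print(body)
```

The resulting string can be passed as data=body to requests.post, as in the question.
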
Related

TensorFlow Serving export signature without arguments

I would like to add an extra signature to a SavedModel that returns a business description, and serve it with TensorFlow Serving.
@tf.function
def info():
    return json.dumps({
        'name': 'My model',
        'description': 'This is model description.',
        'project': 'Product ABCD',
        'type': 'some_type',
        ...
    })
As described in the TensorFlow Core guide https://www.tensorflow.org/guide/saved_model#identifying_a_signature_to_export, I can easily export a signature that accepts arguments by providing tf.TensorSpec.
Is it possible to export a signature without arguments and call it on the server?
Added after @EricMcLachlan's comments:
When I try to call a function with an empty signature (input_signature=[]) using code like this:
data = json.dumps({"signature_name": "info", "inputs": None})
headers = {"content-type": "application/json"}
json_response = requests.post('http://localhost:8501/v1/models/my_model:predict', data=data, headers=headers)
I get the following error in the response:
'_content': b'{ "error": "Failed to get input map for signature: info" }'
Defining the Signature:
I was going to write my own example, but here's a great example provided by @AntPhitlok in another StackOverflow post:
class MyModule(tf.Module):
    def __init__(self, model, other_variable):
        self.model = model
        self._other_variable = other_variable
    @tf.function(input_signature=[tf.TensorSpec(shape=(None, None, 1), dtype=tf.float32)])
    def score(self, waveform):
        result = self.model(waveform)
        return { "scores": result }
    @tf.function(input_signature=[])
    def metadata(self):
        return { "other_variable": self._other_variable }
In this case, what they're serving is a Module, but it could have been a Keras model as well.
Using the Serving:
I am not 100% sure how to access the serving (I haven't done it myself yet) but I think you'll be able to access the serving similarly to this:
import logging
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc
# Excerpt from a method of a client class: channel, model_name, self.version,
# my_input_data, MAX_TIMEOUT and batch_of_texts are defined elsewhere.
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
request = predict_pb2.PredictRequest()
request.model_spec.name = model_name
request.model_spec.signature_name = 'serving_default'
request.model_spec.version_label = self.version
tensor_proto = tf.make_tensor_proto(my_input_data, dtype=tf.float32)
request.inputs['my_signature_input'].CopyFrom(tensor_proto)
try:
    response = stub.Predict(request, MAX_TIMEOUT)
except Exception as ex:
    logging.error(str(ex))
    return [None] * len(batch_of_texts)
Here I'm using gRPC to access the TensorFlow Server.
You'd probably need to substitute 'serving_default' with your signature name. Similarly, 'my_signature_input' should match the input to your tf.function (in your case, I think it's empty).
This is a standard Keras-type prediction that piggybacks off predict_pb2.PredictRequest. It might be necessary to create a custom Protobuf, but that's a bit beyond my abilities at this point.
I hope it's enough to get you going.

How do I need to modify exporting a keras model to accept b64 string to RESTful API/Google cloud ML

The complete code for exporting the model: (I've already trained it and now loading from weights file)
def cnn_layers(inputs):
    conv_base = keras.applications.mobilenetv2.MobileNetV2(input_shape=(224,224,3), input_tensor=inputs, include_top=False, weights='imagenet')
    for layer in conv_base.layers[:-200]:
        layer.trainable = False
    last_layer = conv_base.output
    x = GlobalAveragePooling2D()(last_layer)
    x = keras.layers.GaussianNoise(0.3)(x)
    x = Dense(1024, name='fc-1')(x)
    x = keras.layers.BatchNormalization()(x)
    x = keras.layers.advanced_activations.LeakyReLU(0.3)(x)
    x = Dropout(0.4)(x)
    x = Dense(512, name='fc-2')(x)
    x = keras.layers.BatchNormalization()(x)
    x = keras.layers.advanced_activations.LeakyReLU(0.3)(x)
    x = Dropout(0.3)(x)
    out = Dense(10, activation='softmax', name='output_layer')(x)
    return out
model_input = layers.Input(shape=(224,224,3))
model_output = cnn_layers(model_input)
test_model = keras.models.Model(inputs=model_input, outputs=model_output)
weight_path = os.path.join(tempfile.gettempdir(), 'saved_wt.h5')
test_model.load_weights(weight_path)
export_path='export'
from tensorflow.python.saved_model import builder as saved_model_builder
from tensorflow.python.saved_model import utils
from tensorflow.python.saved_model import tag_constants, signature_constants
from tensorflow.python.saved_model.signature_def_utils_impl import build_signature_def, predict_signature_def
from tensorflow.contrib.session_bundle import exporter
builder = saved_model_builder.SavedModelBuilder(export_path)
signature = predict_signature_def(inputs={'image': test_model.input},
                                  outputs={'prediction': test_model.output})
with K.get_session() as sess:
    builder.add_meta_graph_and_variables(sess=sess,
                                         tags=[tag_constants.SERVING],
                                         signature_def_map={'predict': signature})
    builder.save()
Running the following command (dir 1 has saved_model.pb and models dir):
python /tensorflow/python/tools/saved_model_cli.py show --dir /1 --all
gives this output:
MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:
signature_def['predict']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['image'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 224, 224, 3)
        name: input_1:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['prediction'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 107)
        name: output_layer/Softmax:0
  Method name is: tensorflow/serving/predict
To accept a b64 string:
The code was written for a (224, 224, 3) numpy array. So, the modifications I made to the above code are:
The suffix _bytes should be added to the input name when passing data as b64. So,
predict_signature_def(inputs={'image':......
changed to
predict_signature_def(inputs={'image_bytes':.....
Earlier, test_model.input had shape (224, 224, 3) and dtype DT_FLOAT. So,
signature = predict_signature_def(inputs={'image': test_model.input},.....
changed to (reference)
temp = tf.placeholder(shape=[None], dtype=tf.string)
signature = predict_signature_def(inputs={'image_bytes': temp},.....
Edit:
The code to send the request with requests is (as mentioned in the comments):
encoded_image = None
with open('/1.jpg', "rb") as image_file:
    encoded_image = base64.b64encode(image_file.read())
object_for_api = {"signature_name": "predict",
                  "instances": [
                      {
                          "image_bytes": {"b64": encoded_image}
                          # "b64": encoded_image (or this way since "image" is not needed)
                      }]
                  }
p = requests.post(url='http://localhost:8501/v1/models/mnist:predict', json=json.dumps(object_for_api), headers=headers)
print(p)
I'm getting <Response [400]> error. I think there's no error in the way I'm sending. Something needs to be changed in the code for exporting the model and specifically in
temp = tf.placeholder(shape=[None], dtype=tf.string).
Looking at the docs you've provided, what you want to do is take the image and send it to the API. Images are easy to transfer as text if you encode them, and base64 is pretty much the standard encoding. So we create a JSON object with the base64-encoded image in the right place and send that object to the REST API. Python has the requests library, which makes sending a Python dictionary as JSON very easy.
So take the image, encode it, put it in a dictionary and send it off using requests:
import requests
import base64
encoded_image = None
with open("image.png", "rb") as image_file:
    # b64encode returns bytes; decode to str so the value is JSON serializable
    encoded_image = base64.b64encode(image_file.read()).decode("utf-8")
object_for_api = {"signature_name": "predict",
                  "instances": [
                      {
                          "image": {"b64": encoded_image}
                      }]
                  }
requests.post(url='http://localhost:8501/v1/models/mnist:predict', json=object_for_api)
You can also encode your numpy array into JSON but it doesn't seem that the API docs are looking for that.
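One detail worth checking under Python 3: base64.b64encode returns bytes, which the json module cannot serialize, so the value usually has to be decoded to str before it goes into the dictionary. The sketch below shows that step with the standard library only. The helper name b64_instance and the sample bytes are made up for illustration, and the input key "image_bytes" is taken from the question's modified signature.

```python
import base64
import json


def b64_instance(raw_bytes, input_name="image_bytes"):
    """Wrap raw bytes as one instance using the {"b64": ...} convention."""
    encoded = base64.b64encode(raw_bytes).decode("utf-8")  # bytes -> str
    return {input_name: {"b64": encoded}}


# Stand-in for image_file.read(); real code would read a JPEG or PNG file.
fake_image = b"\x89PNG\r\n\x1a\n not a real image"

payload = {"signature_name": "predict", "instances": [b64_instance(fake_image)]}
body = json.dumps(payload)  # works because the encoded value is now a str
print(body)
```

With requests, pass either json=payload or data=body. Passing json=json.dumps(payload), as in the question, encodes the body twice.
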
Two side notes:
I encourage you to use tf.saved_model.simple_save
You may find model_to_estimator convenient.
While your model seems like it will work for requests (the output of saved_model_cli shows the outer dimension is None for both inputs and outputs), it's fairly inefficient to send JSON arrays of floats
To the last point, it's often easier to modify the code to do the image decoding server side so you're sending a base64 encoded JPG or PNG over the wire instead of an array of floats. Here's one example for Keras (I plan to update that answer with simpler code).

How to make REST calls to a served TensorFlow model using JSON?

I have built and trained a TensorFlow model, which is deployed using the tf.Estimator paradigm. I have built a serving function like the one below:
def serving_input_fn(params):
    feature_placeholders = {
        'inputs' : tf.placeholder(tf.int64, [None], name='inputs')
    }
    features = {
        key: tensor
        for key, tensor in feature_placeholders.items()
    }
    return tf.estimator.export.ServingInputReceiver(features, feature_placeholders)
Now, I want to be able to call it using application/json as content type. So I built a JSON file like the example I found in this question:
payload = {'instances': [{'inputs': [1039]}]}
json_string = json.dumps(payload)
When I invoke the model I get back:
ERROR in serving: Unsupported request data format: {u'instances': [{u'inputs': [1039]}]}.
Valid formats: tensor_pb2.TensorProto, dict<string, tensor_pb2.TensorProto> and predict_pb2.PredictRequest
Any ideas how I can achieve my goal?
As it turns out the JSON should be:
request = {'dtype': 'DT_INT64',
           'tensorShape': {'dim': [{'size': 1}]},
           'int64Val': [1039]}
json_string = json.dumps(request)
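The dictionary above follows the JSON form of a TensorProto: a dtype, a tensorShape with one dim entry per axis, and a typed value list. As a rough sketch, a helper for one-dimensional int64 inputs could look like this. The name int64_tensor_json is made up, and the sketch assumes the serving endpoint accepts exactly the fields shown in the answer.

```python
import json


def int64_tensor_json(values):
    """Build a TensorProto-style dict for a 1-D int64 tensor (hypothetical helper)."""
    values = [int(v) for v in values]
    return {
        "dtype": "DT_INT64",
        "tensorShape": {"dim": [{"size": len(values)}]},  # one dim per axis
        "int64Val": values,
    }


request = int64_tensor_json([1039])
json_string = json.dumps(request)
print(json_string)
```
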

How to read a utf-8 encoded binary string in tensorflow?

I am trying to convert an encoded byte string back into the original array in the tensorflow graph (using tensorflow operations) in order to make a prediction in a tensorflow model. The array to byte conversion is based on this answer and it is the suggested input to tensorflow model prediction on google cloud's ml-engine.
def array_request_example(input_array):
    input_array = input_array.astype(np.float32)
    byte_string = input_array.tostring()
    string_encoded_contents = base64.b64encode(byte_string)
    return string_encoded_contents.decode('utf-8')
Tensorflow code
byte_string = tf.placeholder(dtype=tf.string)
audio_samples = tf.decode_raw(byte_string, tf.float32)
audio_array = np.array([1, 2, 3, 4])
bstring = array_request_example(audio_array)
fdict = {byte_string: bstring}
with tf.Session() as sess:
    [tf_samples] = sess.run([audio_samples], feed_dict=fdict)
I have tried using decode_raw and decode_base64, but neither returns the original values.
I have tried setting the out_type of decode_raw to the different possible data types and tried altering the data type I convert the original array to.
So, how would I read the byte array in TensorFlow? Thanks :)
Extra Info
The aim behind this is to create the serving input function for a custom Estimator to make predictions using gcloud ml-engine local predict (for testing) and using the REST API for the model stored on the cloud.
The serving input function for the Estimator is
def serving_input_fn():
    feature_placeholders = {'b64': tf.placeholder(dtype=tf.string,
                                                  shape=[None],
                                                  name='source')}
    audio_samples = tf.decode_raw(feature_placeholders['b64'], tf.float32)
    # Dummy function to save space
    power_spectrogram = create_spectrogram_from_audio(audio_samples)
    inputs = {'spectrogram': power_spectrogram}
    return tf.estimator.export.ServingInputReceiver(inputs, feature_placeholders)
Json request
I use .decode('utf-8') because when attempting to json dump the base64 encoded byte strings I receive this error
raise TypeError(repr(o) + " is not JSON serializable")
TypeError: b'longbytestring'
Prediction Errors
When passing the JSON request {'audio_bytes': {'b64': bytestring}} with gcloud local I get the error
PredictionError: Invalid inputs: Expected tensor name: b64, got tensor name: [u'audio_bytes']
So perhaps Google Cloud local predict does not automatically handle the audio bytes and base64 conversion? Or, more likely, something's wrong with my Estimator setup.
And the request {'instances': [{'audio_bytes': {'b64': bytestring}}]} to the REST API gives
{'error': 'Prediction failed: Error during model execution: AbortionError(code=StatusCode.INVALID_ARGUMENT, details="Input to DecodeRaw has length 793713 that is not a multiple of 4, the size of float\n\t [[Node: DecodeRaw = DecodeRaw[_output_shapes=[[?,?]], little_endian=true, out_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_source_0_0)]]")'}
which confuses me as I explicitly define the request to be a float and do the same in the serving input receiver.
Removing audio_bytes from the request and utf-8 encoding the byte strings allows me to get predictions, though in testing the decoding locally, I think the audio is being incorrectly converted from the byte string.
The answer that you referenced is written assuming you are running the model on CloudML Engine's service. The service actually takes care of the JSON (including UTF-8) and base64 encoding.
To get your code working locally or in another environment, you'll need the following changes:
def array_request_example(input_array):
    input_array = input_array.astype(np.float32)
    return input_array.tostring()
byte_string = tf.placeholder(dtype=tf.string)
audio_samples = tf.decode_raw(byte_string, tf.float32)
audio_array = np.array([1, 2, 3, 4])
bstring = array_request_example(audio_array)
fdict = {byte_string: bstring}
with tf.Session() as sess:
    tf_samples = sess.run([audio_samples], feed_dict=fdict)
That said, based on your code, I suspect you are looking to send data as JSON; you can use gcloud local predict to simulate CloudML Engine's service. Or, if you prefer to write your own code, perhaps something like this:
def array_request_examples(input_arrays):
    """input_arrays is a list (batch) of np_arrays."""
    input_arrays = (a.astype(np.float32) for a in input_arrays)
    # Convert each array to a byte string
    bytes_strings = (a.tostring() for a in input_arrays)
    # Base64 encode the data
    encoded = (base64.b64encode(b) for b in bytes_strings)
    # Create a list of instances suitable to send to the service as JSON
    # (decode to str so that json.dumps also works under Python 3):
    instances = [{'audio_bytes': {'b64': e.decode('utf-8')}} for e in encoded]
    # Create a JSON request
    return json.dumps({'instances': instances})
def parse_request(request):
    # non-TF to simulate the CloudML Service which does not expect
    # this to be in the submitted graphs.
    instances = json.loads(request)['instances']
    return [base64.b64decode(i['audio_bytes']['b64']) for i in instances]
byte_strings = tf.placeholder(dtype=tf.string, shape=[None])
decode = lambda raw_byte_str: tf.decode_raw(raw_byte_str, tf.float32)
audio_samples = tf.map_fn(decode, byte_strings, dtype=tf.float32)
audio_array = np.array([1, 2, 3, 4])
request = array_request_examples([audio_array])
fdict = {byte_strings: parse_request(request)}
with tf.Session() as sess:
    tf_samples = sess.run([audio_samples], feed_dict=fdict)
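To see what each layer of encoding does without running TensorFlow, the round trip can be reproduced with the standard library. This is only a sketch: struct stands in for NumPy's tostring() and for tf.decode_raw, little-endian float32 is assumed (matching little_endian=true in the error message), and the function names are invented for illustration.

```python
import base64
import json
import struct


def floats_to_b64(values):
    """Pack floats as little-endian float32 and base64-encode them to a str."""
    raw = struct.pack("<%df" % len(values), *values)  # 4 bytes per float32
    return base64.b64encode(raw).decode("utf-8")


def b64_to_floats(text):
    """Reverse of floats_to_b64: base64-decode first, then unpack float32s."""
    raw = base64.b64decode(text)
    if len(raw) % 4 != 0:
        raise ValueError("length %d is not a multiple of 4" % len(raw))
    return list(struct.unpack("<%df" % (len(raw) // 4), raw))


# Client side: build the JSON request.
b64_text = floats_to_b64([1.0, 2.0, 3.0, 4.0])
request = json.dumps({"instances": [{"audio_bytes": {"b64": b64_text}}]})

# Service side: parse the JSON and undo the base64 step before decoding floats.
instance = json.loads(request)["instances"][0]
samples = b64_to_floats(instance["audio_bytes"]["b64"])
print(samples)
```

The length check mirrors the DecodeRaw error in the question: the op works on raw bytes, so whatever reaches it must already have been base64-decoded.
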

"Output 0 of type double does not match declared output type string" while running the iris sample program in TensorFlow Serving

I am running the sample iris program in TensorFlow Serving. Since it is a TF.Learn model, I am exporting the model with classifier.export(export_dir=model_dir,signature_fn=my_classification_signature_fn), and the signature_fn is defined as shown below:
def my_classification_signature_fn(examples, unused_features, predictions):
    """Creates classification signature from given examples and predictions.
    Args:
      examples: `Tensor`.
      unused_features: `dict` of `Tensor`s.
      predictions: `Tensor` or dict of tensors that contains the classes tensor
        as in {'classes': `Tensor`}.
    Returns:
      Tuple of default classification signature and empty named signatures.
    Raises:
      ValueError: If examples is `None`.
    """
    if examples is None:
        raise ValueError('examples cannot be None when using this signature fn.')
    if isinstance(predictions, dict):
        default_signature = exporter.classification_signature(
            examples, classes_tensor=predictions['classes'])
    else:
        default_signature = exporter.classification_signature(
            examples, classes_tensor=predictions)
    named_graph_signatures = {
        'inputs': exporter.generic_signature({'x_values': examples}),
        'outputs': exporter.generic_signature({'preds': predictions})}
    return default_signature, named_graph_signatures
The model gets exported successfully with this code.
I have created a client which makes real-time predictions using TensorFlow Serving.
The following is the code for the client:
flags.DEFINE_string("model_dir", "/tmp/iris_model_dir", "Base directory for output models.")
tf.app.flags.DEFINE_integer('concurrency', 1,
                            'maximum number of concurrent inference requests')
tf.app.flags.DEFINE_string('server', '', 'PredictionService host:port')
#connection
host, port = FLAGS.server.split(':')
channel = implementations.insecure_channel(host, int(port))
stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)
# Classify two new flower samples.
new_samples = np.array([5.8, 3.1, 5.0, 1.7], dtype=float)
request = predict_pb2.PredictRequest()
request.model_spec.name = 'iris'
request.inputs["x_values"].CopyFrom(
    tf.contrib.util.make_tensor_proto(new_samples))
result = stub.Predict(request, 10.0) # 10 secs timeout
However, on making the predictions, the following error is displayed:
grpc.framework.interfaces.face.face.AbortionError: AbortionError(code=StatusCode.INTERNAL, details="Output 0 of type double does not match declared output type string for node _recv_input_example_tensor_0 = _Recv[client_terminated=true, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=2016246895612781641, tensor_name="input_example_tensor:0", tensor_type=DT_STRING, _device="/job:localhost/replica:0/task:0/cpu:0"]()")
The entire stack trace was attached as an image (not reproduced here).
The iris model is defined in the following manner:
# Specify that all features have real-value data
feature_columns = [tf.contrib.layers.real_valued_column("", dimension=4)]
# Build 3 layer DNN with 10, 20, 10 units respectively.
classifier = tf.contrib.learn.DNNClassifier(feature_columns=feature_columns,
                                            hidden_units=[10, 20, 10],
                                            n_classes=3, model_dir=model_dir)
# Fit model.
classifier.fit(x=training_set.data,
               y=training_set.target,
               steps=2000)
Kindly suggest a solution for this error.
I think the problem is that your signature_fn is taking the else branch and passing predictions as the output to the classification signature, which expects a string output and not a double output. Either use a regression signature function or add something to the graph to get the output in the form of a string.