TensorFlow Serving: Unable to base64 decode

I used the slim package's resnet_v2_152 to train a classification model.
It was then exported to a .pb file to provide a service.
Because the input is an image, it is encoded with web-safe base64. The graph input looks like this:
serialized_tf_example = tf.placeholder(dtype=tf.string, name='tf_example')
decoded = tf.decode_base64(serialized_tf_example)
I then encode an image with base64 as follows:
img_path = '/Users/wuyanxue/Desktop/not_emoji1.jpeg'
img_b64 = base64.b64encode(open(img_path, 'rb').read())
s = str(img_b64, encoding='utf-8')
s = s.replace('+', '-').replace(r'/', '_')
My post data is structured as follows:
post_data = {
    'signature_name': 'predict',
    'instances': [
        {'inputs': {'b64': s}}
    ]
}
Finally, I post an HTTP request to this server:
res = requests.post('server_address', json=post_data)
It gives me:
'{ "error": "Failed to process element: 0 key: inputs of \\\'instances\\\' list. Error: Invalid argument: Unable to base64 decode" }'
Why does this error occur, and how can I fix it?

I had the same issue when using Python 3. I solved it by adding a 'b' prefix so that the request body is a bytes-like object instead of the default str, which lets the output of base64.b64encode be formatted in directly:
b'{"instances" : [{"b64": "%s"}]}' % base64.b64encode(
Hope that helps, please see this answer for extra info.

This question is already solved. Consider the post data:
post_data = {
    'signature_name': 'predict',
    'instances': [
        {'inputs': {'b64': s}}
    ]
}
The input is wrapped in a 'b64' key, which tells TensorFlow Serving to base64-decode s itself.
This decoding is handled internally by TensorFlow Serving.
So the placeholder
serialized_tf_example = tf.placeholder(dtype=tf.string, name='tf_example')
receives the raw binary input data, not the base64 string.
Consequently,
decoded = tf.decode_base64(serialized_tf_example)
is not necessary.
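To illustrate the point, here is a minimal sketch using only the Python standard library. It assumes the server decodes the 'b64' value with standard base64 before feeding the graph. The helper names build_request and simulate_server_decode are hypothetical and only mimic that behavior; they are not part of TensorFlow Serving.

```python
import base64
import json


def build_request(image_bytes: bytes) -> str:
    """Build a REST predict body; the 'b64' wrapper marks binary input."""
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    body = {
        "signature_name": "predict",
        "instances": [{"inputs": {"b64": encoded}}],
    }
    return json.dumps(body)


def simulate_server_decode(request_body: str) -> bytes:
    """Hypothetical stand-in for the decoding the server performs."""
    instance = json.loads(request_body)["instances"][0]
    return base64.b64decode(instance["inputs"]["b64"])


raw = b"\xff\xd8\xff\xe0 fake jpeg bytes"
received = simulate_server_decode(build_request(raw))
# The graph input already holds the raw bytes, so no further
# base64 decoding is needed inside the model.
print(received == raw)
```

If the graph then applies a second base64 decode to these raw bytes, it is decoding data that is no longer base64, which matches the reported error.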


How to fix UnicodeDecodeError for bytes from requests?

I have the following full working example code using selenium-wire to record all requests made.
import os
import sys
import json
from seleniumwire import webdriver

driver = webdriver.Chrome()
list_requests = []
for request in driver.requests:
    req = {
        "method": request.method,
        "url": request.url,
        "body": request.body.decode(),  # to avoid json error
        "headers": {k: str(v) for k, v in request.headers.__dict__.items()},  # to avoid json error
    }
    if request.response:
        resp = {
            "status_code": request.response.status_code,
            "reason": request.response.reason,
            "body": request.response.body.decode(),  # ???
            "headers": {k: str(v) for k, v in request.response.headers.__dict__.items()},  # to avoid json error
        }
        req["response"] = resp
    # collect the record so that it ends up in the output file
    list_requests.append(req)
with open("test.json", "w") as outfile:
    json.dump(list_requests, outfile)
However, the decoding of the response body creates an error
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb5 in position 1: invalid start byte
and without trying to decode the response body I get an error
TypeError: Object of type bytes is not JSON serializable
I do not care about the encoding; I just want to be able to write the 'body' to the JSON file in some way. If needed, the offending byte or character can be removed.
Any ideas how this problem can be solved?
I used the following approach to extract a field (some_key) from a JSON response:
from gzip import decompress
import json
some_key = None
for request in driver.requests:
    if request.response:
        if request.method == 'POST':
            print(request.method + ' ' + request.url)
            try:
                # try to parse the json response to extract the data
                data = json.loads(request.response.body)
                print('parsed as json')
                if 'some_key' in data:
                    some_key = data['some_key']
            except UnicodeDecodeError:
                # decompress on UnicodeDecodeError and parse the json response to extract the data
                data = json.loads(decompress(request.response.body))
                print('decompressed and parsed as json')
                if 'some_key' in data:
                    some_key = data['some_key']
            except json.decoder.JSONDecodeError:
                # the body is not JSON; keep the raw bytes
                data = request.response.body
                print('not parsed as json')
gzip.decompress helped me with UnicodeDecodeError.
Hope this will be helpful.
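For the original goal of writing every body to a JSON file, the idea can be combined with a lossy decode. The sketch below is a minimal illustration using only the standard library. The helper name body_to_text is hypothetical, and it assumes compressed bodies are gzip (detected by the 0x1f 0x8b magic bytes). Other content encodings such as Brotli would need their own decompressor.

```python
import gzip
import json
import zlib

GZIP_MAGIC = b"\x1f\x8b"


def body_to_text(raw: bytes) -> str:
    """Return a JSON-serializable string for an HTTP body (hypothetical helper)."""
    if raw.startswith(GZIP_MAGIC):
        try:
            raw = gzip.decompress(raw)
        except (OSError, EOFError, zlib.error):
            pass  # not valid gzip after all; fall through to the lossy decode
    # errors="replace" swaps undecodable bytes for U+FFFD instead of raising.
    return raw.decode("utf-8", errors="replace")


compressed = gzip.compress(b'{"some_key": 1}')
record = {
    "gzip_body": body_to_text(compressed),
    "binary_body": body_to_text(b"\x28\xb5\x2f\xfd"),  # arbitrary non-UTF-8 bytes
}
serialized = json.dumps(record)
```

The replacement is lossy by design, which fits the stated requirement that the offending bytes may be dropped. If the exact bytes matter, storing a base64 string instead would preserve them.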

How to send numpy array to sagemaker endpoint using lambda function

How do I invoke a SageMaker endpoint with input data of type numpy.ndarray?
I have deployed a SageMaker model and am trying to call it from a Lambda function.
But I am unable to figure out how to do it; I am getting a server error.
The total data set has shape=(91,5,12).
Below is one row of the input data:
array([[[0.30440741, 0.30209799, 0.33520652, 0.41558442, 0.69096432,
0.69611016, 0.25153326, 0.98333333, 0.82352941, 0.77187154,
0.7664042 , 0.74468085],
[0.30894981, 0.33151662, 0.22907725, 0.46753247, 0.69437367,
0.70410559, 0.29259044, 0.9 , 0.80882353, 0.79401993,
0.89501312, 0.86997636],
[0.33511896, 0.34338939, 0.24065546, 0.48051948, 0.70384005,
0.71058715, 0.31031288, 0.86666667, 0.89705882, 0.82724252,
0.92650919, 0.89125296],
[0.34617355, 0.36150251, 0.23726854, 0.54545455, 0.71368726,
0.71703244, 0.30228356, 0.85 , 0.86764706, 0.86157254,
0.97112861, 0.94089835],
[0.36269508, 0.35923332, 0.40285461, 0.62337662, 0.73325475,
0.7274392 , 0.26241391, 0.85 , 0.82352941, 0.89922481,
0.9343832 , 0.90780142]]])
I am using the following code but am unable to invoke the endpoint:
import boto3
def lambda_handler(event, context):
    # The SageMaker runtime is what allows us to invoke the endpoint that we've created.
    runtime = boto3.Session().client('sagemaker-runtime')
    endpoint = 'sagemaker-tensorflow-2019-04-22-07-16-51-717'
    print('givendata ', event['body'])
    # data = numpy.array([numpy.array(xi) for xi in event['body']])
    data = event['body']
    print('numpy array ', data)
    # Now we use the SageMaker runtime to invoke our endpoint, sending the review we were given
    response = runtime.invoke_endpoint(EndpointName = endpoint,  # The name of the endpoint we created
                                       ContentType = 'application/json',  # The data format that is expected
                                       Body = data)  # The actual review
    # The response is an HTTP response whose body contains the result of our inference
    result = response['Body'].read().decode('utf-8')
    print('response', result)
    # Round the result so that our web app only gets '1' or '0' as a response.
    result = round(float(result))
    return {
        'statusCode' : 200,
        'headers' : { 'Content-Type' : 'text/plain', 'Access-Control-Allow-Origin' : '*' },
        'body' : str(result)
    }
I am unable to figure out what should be written for ContentType,
because I do not know the MIME type for a numpy.ndarray.
Here is what I had and how I solved it:
import numpy as np
from sagemaker.tensorflow import TensorFlowPredictor
predictor = TensorFlowPredictor('sagemaker-tensorflow-serving-date')
data = np.array(raw_data)
response = predictor.predict(data=data)
predictions = response['predictions']
What I did to find the answer:
I looked up the predictor.py and content_types.py implementations in the SageMaker Python library to see which content types it used and which arguments it took.
At first I thought the application/x-npy content type was used, so I tried the serialisation code from predictor.py and passed application/x-npy as content_type to invoke_endpoint.
That returned 415 (unsupported media type), so the issue was still the content_type. The following print statements helped me reveal which content_type the predictor actually uses (application/json), so I took the appropriate serialisation code from predictor.py.
from sagemaker.tensorflow import TensorFlowPredictor
predictor = TensorFlowPredictor('sagemaker-tensorflow-serving-date')
data = np.array(raw_data)
response = predictor.predict(data=data)
predictions = response['predictions']
Solution for Lambda:
import json
import boto3
import botocore.config
import numpy as np
ENDPOINT_NAME = 'sagemaker-tensorflow-serving-date'
config = botocore.config.Config(read_timeout=80)
runtime = boto3.client('runtime.sagemaker', config=config)
data = np.array(raw_data)
payload = json.dumps(data.tolist())
response = runtime.invoke_endpoint(EndpointName=ENDPOINT_NAME,
                                   ContentType='application/json',
                                   Body=payload)
result = json.loads(response['Body'].read().decode())
res = result['predictions']
Note: numpy is not included in Lambda, so you would either have to bundle numpy yourself or, instead of calling data.tolist(), work with a plain Python list and json.dumps that list (of lists). From your code it looks like you already have a Python list rather than a numpy array, so a simple json.dumps should work.
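As a minimal sketch of the numpy-free variant mentioned in the note, the snippet below builds the request body from a plain nested list using only the standard library. The helper name make_payload and the sample values are illustrative assumptions. The resulting string is what would be passed as Body to invoke_endpoint with ContentType='application/json'.

```python
import json


def make_payload(rows):
    """Serialize a nested list (batch x timesteps x features) to a JSON string."""
    return json.dumps(rows)


# One sample with 2 timesteps and 3 features (illustrative values only).
sample = [[[0.30, 0.31, 0.33], [0.34, 0.36, 0.35]]]
payload = make_payload(sample)

# Round-trip check: the endpoint would receive the same nested structure.
decoded = json.loads(payload)
print(len(decoded), len(decoded[0]), len(decoded[0][0]))
```

Keeping the data as nested lists avoids packaging numpy with the Lambda function at all.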
If you are training and hosting a custom algorithm on SageMaker using TensorFlow, you can serialize and deserialize the request and response as JSON, as in the TensorFlow Serving Predict API.
import numpy
from sagemaker.predictor import json_serializer, json_deserializer
# define predictor
predictor = estimator.deploy(1, instance_type)
# format request
data = {'instances': numpy.asarray(np_array).astype(float).tolist()}
# set predictor request/response formats
predictor.accept = 'application/json'
predictor.content_type = 'application/json'
predictor.serializer = json_serializer
predictor.deserializer = json_deserializer
# run inference using SageMaker predict class
# https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/predictor.py
You can refer to the example notebook here to train and host a custom TensorFlow container.

How to send a byte array as a part of Json with karate framework

I have an endpoint that consumes JSON with 2 attributes, like
{id='12344', data=byte_array}
so I wrote a test:
Feature: submitted request
Scenario: submitted request
* def convertToBytes =
"""
function(arg) {
  var StreamUtils = Java.type('my.utils.StreamUtils');
  // it reads the stream and converts it to a byte array
  return StreamUtils.getBytes(arg);
}
"""
Given url 'http://my-server/post'
And def image = convertToBytes(read('classpath:images/image_1.jpg'));
And request {id:1, data: "#(image)"}
When method POST
Then status 200
However, I got an exception from Karate without much detail:
ERROR com.intuit.karate - http request failed: [B cannot be cast to [Ljava.lang.Object;
Any hints on how to submit byte arrays as part of JSON with Karate?
I don't think you can do that. Either the whole request should be binary (byte-array) or you do a multi-part request, where binary is Base64 encoded. As far as I know you can't put binary inside JSON. There is something called Binary JSON though.
EDIT: after assuming that the byte[] has to be Base64 encoded:
* url demoBaseUrl
* def Base64 = Java.type('java.util.Base64')
Scenario: json with byte-array
Given path 'echo', 'binary'
And def encoded = Base64.encoder.encodeToString('hello'.bytes);
And request { message: 'hello', data: '#(encoded)' }
When method post
Then status 200
And def expected = Base64.encoder.encodeToString('world'.bytes);
And match response == { message: 'world', data: '#(expected)' }
I just added this test to the Karate demos, and it is working fine. Here is the commit.

How to upload input file for batch prediction in gcloud ml-engine?

I'm trying to create a batch prediction job in google cloud ml-engine. Unfortunately, I always get the same error:
insertId: "wr85wwg6shs9ek"
logName: "projects/tensorflow-test-1-168615/logs/ml.googleapis.com%2Ftest_job_23847239"
receiveTimestamp: "2017-08-04T16:07:29.524193256Z"
resource: {
  labels: {
    job_id: "test_job_23847239"
    project_id: "tensorflow-test-1-168615"
    task_name: "service"
  }
  type: "ml_job"
}
severity: "ERROR"
textPayload: "TypeError: decoding Unicode is not supported"
timestamp: "2017-08-04T16:07:29.524193256Z"
I create the file in java and upload it to a bucket with the following code:
BufferedImage bufferedImage = ImageIO.read(new URL(media.getUrl()));
int[][][] imageMatrix = convertToImageToMatrix(bufferedImage);
String imageString = matrixToString(imageMatrix);
String inputContent = "{\"instances\": [{\"inputs\": " + imageString + "}]}";
byte[] inputBytes = inputContent.getBytes(Charset.forName("UTF-8"));
Blob inputBlob = mlInputBucket.create(media.getId().toString() + ".json", inputBytes, "application/json");
inputPaths.add("gs://" + Properties.getCloudBucketNameInputs() + "/" + inputBlob.getName());
In this code, I download the image, convert it to a uint8 matrix and format the matrix as a JSON string. The file gets created and is present in the bucket. I also verified that the JSON file is valid.
In the next step, I collect all created files and start the prediction job:
GoogleCloudMlV1PredictionInput input = new GoogleCloudMlV1PredictionInput();
input.setVersionName("projects/" + DatastoreOptions.getDefaultProjectId() + "/models/" + Properties.getMlEngineModelName() + "/versions/" + Properties.getMlEngineModelVersion());
input.setOutputPath("gs://" + Properties.getCloudBucketNameOutputs() + "/" + jobId);
GoogleCloudMlV1Job job = new GoogleCloudMlV1Job();
engine.projects().jobs().create("projects/" + DatastoreOptions.getDefaultProjectId() , job).execute();
Finally, the job gets created, but the result is the error shown at the beginning.
I also tried to start the job with the gcloud SDK, but the result is the same. However, when I modify the file to remove the instances object and match the correct format for online prediction, it works (to make it work, I need to remove most of the rows from the input because of the payload quota for online predictions).
I'm using the trained pets model from the object detection. One of my created input files can be found here.
What am I doing wrong here?
Did I answer your question in "tensorflow serving prediction not working with object detection pets example"? The input for batch prediction should not include the '{"instances": }' wrapper.
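As a minimal sketch of that fix, the snippet below converts an online-prediction style body into batch input content, under the assumption that batch prediction reads newline-delimited JSON with one instance per line. The helper name to_batch_lines and the tiny sample matrices are hypothetical.

```python
import json


def to_batch_lines(online_body: str) -> str:
    """Unwrap {"instances": [...]} and emit one JSON instance per line."""
    instances = json.loads(online_body)["instances"]
    return "\n".join(json.dumps(instance) for instance in instances)


online_body = json.dumps({
    "instances": [
        {"inputs": [[[0, 0, 0], [255, 255, 255]]]},
        {"inputs": [[[1, 2, 3], [4, 5, 6]]]},
    ]
})
batch_file_content = to_batch_lines(online_body)
print(batch_file_content)
```

The same unwrapping could be done on the Java side by writing each instance object on its own line instead of concatenating the "instances" wrapper around it.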

How to feed inputs into a loaded Tensorflow model using C++

I want to create and train a model, export it and run inference in C++.
I'm following the tutorial listed here: https://www.tensorflow.org/tutorials/wide_and_deep
I'm also trying to use the SavedModel approach as described here, since this is the canonical way to export TensorFlow graphs for serving.
At the very end, I export the saved model as follows:
feature_spec = tf.contrib.layers.create_feature_spec_for_parsing(feature_columns)
serving_input_fn = input_fn_utils.build_parsing_serving_input_fn(feature_spec)
output = model.export_savedmodel(model_dir, serving_input_fn, as_text=True)
print('Model saved to {}'.format(output))
I see the saved_model.pbtxt has the following signature definition.
signature_def {
key: "serving_default"
value {
inputs {
key: "inputs"
value {
name: "input_example_tensor:0"
dtype: DT_STRING
tensor_shape {
dim {
size: -1
outputs {
I can load the saved model on the C++ side
SavedModelBundle bundle;
const std::string graph_path = "models/1498572863";
const std::unordered_set<std::string> tags = {"serve"};
Status status = LoadSavedModel(session_options,
run_options, graph_path,
tags, &bundle);
I'm stuck at the last part where I need to feed the input into this model.
The Run function expects the input parameter to be of the form: std::vector<std::pair<string, Tensor>>.
I would have expected this to be a vector of pairs where the key is the feature name used in the python code and the Tensor is multiple values for that feature.
However, it seems to expect the string to be "input_example_tensor".
I'm not sure how I'm supposed to now feed the model with different features using a single Tensor.
std::vector<string> output_tensor_names = {
// How do I create input_tensor?
status = bundle.session->Run({{"input_example_tensor", input_tensor}},
output_tensor_names, {}, &outputs);
I did something like this
tensorflow::Example example;
auto& tf_feature_map = *(example.mutable_features()->mutable_feature());
const std::string& serialized = example.SerializeAsString();
tensorflow::Input input({serialized});
status = bundle.session->Run({{"input_example_tensor", input.tensor()}},
output_tensor_names, {}, &outputs);
Your model signature suggests that it is expecting a DT_STRING tensor as input. When using tensorflow::Example, this typically means that the protocol buffer needs to be serialized into a tensor with a string as the type of its elements.
To convert the tensorflow::Example object to a string, you can use the protocol buffer methods such as SerializeToString, SerializeAsString etc.
Hope that helps.