Deploying a TensorFlow model on Google Cloud that receives a base64 encoded string as a model input

I have successfully set up Google Cloud and deployed a pre-trained ML model that takes an input tensor (image) of shape=(?, 224, 224, 3) and dtype=float32. It works well, but sending a float array is inefficient for REST requests; the model should really accept a base64-encoded string instead. The challenge is that I am using transfer learning and cannot control the input of the original pre-trained model.
To get around this without adding additional infrastructure, I created a small graph (wrapper) that handles the base64-to-array conversion and connected it to my pre-trained model graph, yielding a single new graph. The small graph takes an input tensor with shape=(), dtype=string and returns a tensor with shape=(224, 224, 3), dtype=float32, which can then be passed to the original model. The model compiles to a .pb file without errors and deploys successfully, but I get the following error when making my POST request:
{'error': 'Prediction failed: Error during model execution: AbortionError(code=StatusCode.INVALID_ARGUMENT, details="Index out of range using input dim 0; input has only 0 dims\n\t [[{{node lambda/map/while/strided_slice}}]]")'}
Post request body:
{'instances': [{'b64': 'iVBORw0KGgoAAAANSUhEUgAAAOAA...'}]}
This error leads me to believe that either the POST request is incorrectly formatted for handling the base64 string, or the input of my base64 conversion graph is set up incorrectly. Locally, I can call predict on my combined model with a tensor of shape=(), dtype=string and get a result successfully.
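For reference, here is a minimal sketch of how such a request body can be built on the client, using only the Python standard library. The image bytes are a placeholder rather than a real PNG, and `nesting_shape` is a hypothetical helper that only illustrates the batch layout. It assumes that the prediction service turns each `{'b64': ...}` object into one scalar string, so the instances list shown above would arrive as a batch of shape (1,) rather than (1, 1). If that reading is right, `im[0]` inside `tf.map_fn` would index a 0-dimensional tensor, which would be consistent with the strided_slice error.

```python
import base64
import json


def build_instances(raw_images):
    """Wrap raw image bytes in the {'b64': ...} objects used for binary inputs."""
    return [{'b64': base64.b64encode(raw).decode('ascii')} for raw in raw_images]


def nesting_shape(instances):
    """Hypothetical helper: batch shape if every {'b64': ...} object is one scalar string."""
    shape = []
    node = instances
    while isinstance(node, list):
        shape.append(len(node))
        node = node[0]
    return tuple(shape)


# Stand-in bytes; this is not a decodable image.
fake_png = b'\x89PNG\r\n\x1a\n' + b'placeholder bytes'

flat = build_instances([fake_png])     # [{'b64': ...}], as in the request above
nested = [[item] for item in flat]     # [[{'b64': ...}]], one extra dimension

body = json.dumps({'instances': flat})

print(nesting_shape(flat))    # (1,)
print(nesting_shape(nested))  # (1, 1)
```

Under the same assumption, a model whose input is declared as Input(shape=(1,)) expects the (1, 1) layout, so either the request needs the extra nesting or the Lambda layer should map over scalars without the `[0]` index.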
Here is my code for combining the 2 graphs:
import tensorflow as tf
# Local dependencies
from myProject.classifier_models import mobilenet
from myProject.dataset_loader import dataset_loader
from myProject.utils import f1_m, recall_m, precision_m
with tf.keras.backend.get_session() as sess:
    def preprocess_and_decode(img_str, new_shape=[224,224]):
        #img =
        img = tf.image.decode_png(img_str, channels=3)
        img = (tf.cast(img, tf.float32)/127.5) - 1
        img = tf.image.resize_images(img, new_shape, method=tf.image.ResizeMethod.AREA, align_corners=False)
        # If you need to squeeze your input range to [0,1] or [-1,1] do it here
        return img

    InputLayer = tf.keras.layers.Input(shape = (1,),dtype="string")
    OutputLayer = tf.keras.layers.Lambda(lambda img : tf.map_fn(lambda im : preprocess_and_decode(im[0]), img, dtype="float32"))(InputLayer)
    base64_model = tf.keras.Model(InputLayer,OutputLayer)

    tf.keras.backend.set_learning_phase(0) # Ignore dropout at inference
    transfer_model = tf.keras.models.load_model('./trained_model/mobilenet_93.h5', custom_objects={'f1_m': f1_m, 'recall_m': recall_m, 'precision_m': precision_m})

    base64_input = base64_model.input
    final_output = transfer_model(base64_model.output)
    new_model = tf.keras.Model(base64_input,final_output)

    export_path = '../myModels/001'
    # Export call assumed to be tf.saved_model.simple_save (TF 1.x)
    tf.saved_model.simple_save(
        sess,
        export_path,
        inputs={'input_class': new_model.input},
        outputs={'output_class': new_model.output})
Tech: TensorFlow 1.13.1 & Python 3.5
I have looked at a bunch of related posts.
Update 06/12/2019:
Inspecting the three graph summaries, everything appears to be merged correctly.
Update 06/14/2019:
Ended up going with this alternative strategy instead, implementing a tf.estimator


Unable to save model with tensorflow 2.0.0 beta1

I have tried all the options described in the documentation but none of them allowed me to save my model in tensorflow 2.0.0 beta1. I've also tried to upgrade to the (also unstable) TF2-RC but that ruined even the code I had working in beta so I quickly rolled back for now to beta.
See a minimal reproduction code below.
What I have tried:
model.save("mymodel.h5")
NotImplementedError: Saving the model to HDF5 format requires the model to be a Functional model or a Sequential model. It does not work for subclassed models, because such models are defined via the body of a Python method, which isn't safely serializable. Consider saving to the Tensorflow SavedModel format (by setting save_format="tf") or using save_weights.
model.save("mymodel", format='tf')
ValueError: Model <main.CVAE object at 0x7f1cac2e7c50> cannot be saved because the input shapes have not been set. Usually, input shapes are automatically determined from calling .fit() or .predict(). To manually set the shapes, call model._set_inputs(inputs).
model._set_input(input_sample)
model.save("mymodel", format='tf')
AssertionError: tf.saved_model.save is not supported inside a traced @tf.function. Move the call to the outer eagerly-executed context.
And this is where I am stuck now, because the message gives me no useful hint. I am NOT calling the save() function from a @tf.function; I am already calling it from the outermost scope possible. In fact, I have no @tf.function at all in the minimal reproduction script below and I still get the same error.
So I really have no idea how to save my model. I've tried every option, and they all throw errors and provide no hints.
The minimal reproduction example below works fine if you set save_model=False and it reproduces the error when save_model=True.
It may seem unnecessary in this simplified auto-encoder code example to use a subclassed model but I have lots of custom functions added to it in my original VAE code that I need it for.
import tensorflow as tf
save_model = True
learning_rate = 1e-4
color_channels = 1
imsize = 28
(train_images, _), (test_images, _) = tf.keras.datasets.mnist.load_data()
train_images = train_images[:5000, ::]
test_images = train_images[:1000, ::]
train_images = train_images.reshape(-1, imsize, imsize, 1).astype('float32')
test_images = test_images.reshape(-1, imsize, imsize, 1).astype('float32')
train_images /= 255.
test_images /= 255.
train_dataset =
test_dataset =
class AE(tf.keras.Model):
    def __init__(self):
        super(AE, self).__init__()
        # Attribute name `network` is assumed
        self.network = tf.keras.Sequential([
            tf.keras.layers.InputLayer(input_shape=(imsize, imsize, color_channels)),
            tf.keras.layers.Dense(imsize**2 * color_channels),
            tf.keras.layers.Reshape(target_shape=(imsize, imsize, color_channels)),
        ])

    def decode(self, input):
        logits = self.network(input)
        return logits
optimizer = tf.keras.optimizers.Adam(learning_rate)
model = AE()
def compute_loss(data):
    logits = model.decode(data)
    loss = tf.reduce_mean(tf.losses.mean_squared_error(logits, data))
    return loss

def train_step(data):
    with tf.GradientTape() as tape:
        loss = compute_loss(data)
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    return loss, 0

def test_step(data):
    loss = compute_loss(data)
    return loss
input_shape_set = False
epoch = 0
epochs = 20
for epoch in range(epochs):
    for train_x in train_dataset:
        train_step(train_x)
    if epoch % 1 == 0:
        loss = 0.0
        num_batches = 0
        for test_x in test_dataset:
            loss += test_step(test_x)
            num_batches += 1
        loss /= num_batches
        print("Epoch: {}, Loss: {}".format(epoch, loss))
        if save_model:
            print("Saving model...")
            if not input_shape_set:
                # Note: Why set input shape manually and why here:
                # 1. If I do not set input shape manually: ValueError: Model <main.CVAE object at 0x7f1cac2e7c50> cannot be saved because the input shapes have not been set. Usually, input shapes are automatically determined from calling .fit() or .predict(). To manually set the shapes, call model._set_inputs(inputs).
                # 2. If I set input shape manually BEFORE the first actual train step, I get: RuntimeError: Attempting to capture an EagerTensor without building a function.
                input_shape_set = True
            # Note: Why choose tf format: model.save('MNIST/Models/model.h5') will return NotImplementedError: Saving the model to HDF5 format requires the model to be a Functional model or a Sequential model. It does not work for subclassed models, because such models are defined via the body of a Python method, which isn't safely serializable. Consider saving to the Tensorflow SavedModel format (by setting save_format="tf") or using save_weights.
            model.save('MNIST/Models/model', save_format='tf')
I have tried the same minimal reproduction example in tensorflow-gpu 2.0.0-rc0 and the error was more revealing than what the beta version gave me. The error in RC says:
NotImplementedError: When subclassing the Model class, you should
implement a call method.
This got me reading further, and I found examples of how to do subclassing in TF2 in a way that allows saving. I was able to resolve the error and save the model by renaming my 'decode' method to 'call' in the above example (although this will be more complicated in my actual code, where the class has several methods). This solved the error in both beta and rc. Strangely, the training (or the saving) also got much faster in rc.
You should change two things:
Change the decode method to call, as you pointed out.
Because the network is a Sequential model held as an attribute of the class rather than built by the class itself, call the save method on that attribute of the model, i.e., model.network.save('mymodel.h5') (the attribute name `network` is assumed here).
Alternatively, to keep things more standard, you can implement this method inside the AE class, as follows:
def save(self, save_dir):
    self.network.save(save_dir)
Cheers mate

Tensorflow Serving, online predictions: How to build a signature_def that accepts 'image_bytes' as input tensor name?

I have successfully trained a Keras model and used it for predictions on my local machine. Now I want to deploy it using TensorFlow Serving. My model takes images as input and returns a mask prediction.
According to the documentation here my instances need to be formatted like this:
{'image_bytes': {'b64': base64.b64encode(jpeg_data).decode()}}
Now, the saved_model.pb file automatically saved by my Keras model has the following tensor names:
input_tensor = graph.get_tensor_by_name('input_image:0')
output_tensor = graph.get_tensor_by_name('conv2d_23/Sigmoid:0')
Therefore I need to save a new saved_model.pb file with a different signature_def.
I tried the following (see here for reference), which runs without errors:
with tf.Session(graph=tf.Graph()) as sess:
    tf.saved_model.loader.load(sess, ['serve'], 'path/to/saved/model/')
    graph = tf.get_default_graph()
    input_tensor = graph.get_tensor_by_name('input_image:0')
    output_tensor = graph.get_tensor_by_name('conv2d_23/Sigmoid:0')
    tensor_info_input = tf.saved_model.utils.build_tensor_info(input_tensor)
    tensor_info_output = tf.saved_model.utils.build_tensor_info(output_tensor)
    # build_signature_def and method_name are assumed from the surviving arguments
    prediction_signature = (
        tf.saved_model.signature_def_utils.build_signature_def(
            inputs={'image_bytes': tensor_info_input},
            outputs={'output_bytes': tensor_info_output},
            method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME))
    builder = tf.saved_model.builder.SavedModelBuilder('path/to/saved/new_model/')
    builder.add_meta_graph_and_variables(
        sess, [tf.saved_model.tag_constants.SERVING],
        signature_def_map={'predict_images': prediction_signature, })
    builder.save()
But when I deploy the model and request predictions from the AI Platform, I get the following error:
RuntimeError: Prediction failed: Error processing input: Expected float32, got {'b64': 'Prm4OD7JyEg+paQkPrGwMD7BwEA'} of type 'dict' instead.
Adapting the answer here, I also tried to rewrite
input_tensor = graph.get_tensor_by_name('input_image:0')
image_placeholder = tf.placeholder(tf.string, name='b64')
graph_input_def = graph.as_graph_def()
input_tensor, = tf.import_graph_def(
input_map={'b64:0': image_placeholder},
with the (wrong) understanding that this would add a layer on top of my input tensor, with a matching 'b64' name (as per the documentation), that accepts a string and connects it to the original input tensor.
But the error from the AI Platform is the same.
(The relevant code I use for requesting a prediction is:
instances = [{'image_bytes': {'b64': base64.b64encode(image).decode()}}]
response = service.projects().predict(
    body={'instances': instances}
where image is a numpy.ndarray of dtype('float32').)
I feel I'm close, but I'm definitely missing something. Can you please help?
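After the base64 encode/decode round trip, the image buffer arrives as a string (raw bytes), which does not match the float32 input type of your model.
You could add a preprocessing step to your model that converts the decoded bytes into the expected input, and then send the b64 request again.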
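To make the point above concrete, here is a minimal sketch using only the Python standard library (the four sample values are made up). Base64 is only a transport encoding: after decoding, the receiving side holds a raw byte string, and something still has to reinterpret those bytes as float32 before they match a float32 input. In a served model that reinterpretation would be part of the preprocessing mentioned in the answer; `struct.unpack` merely stands in for it here.

```python
import base64
import struct

pixels = [0.25, 0.5, 0.75, 1.0]                  # made-up float32 values
raw = struct.pack('<4f', *pixels)                # 16 bytes, little-endian float32

encoded = base64.b64encode(raw).decode('ascii')  # what goes into {'b64': ...}
decoded = base64.b64decode(encoded)              # what arrives after decoding

# The decoded payload is a byte string, not an array of floats.
print(type(decoded).__name__, len(decoded))      # bytes 16

# An explicit reinterpretation step recovers the original values.
restored = list(struct.unpack('<4f', decoded))
print(restored)                                  # [0.25, 0.5, 0.75, 1.0]
```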

How do I use a pretrained network as a layer in Tensorflow?

I want to use a feature extractor (such as ResNet101) and add layers after that which use the output of the feature extractor layer. However, I can't seem to figure out how. I have only found solutions online where an entire network is used without adding additional layers.
I am inexperienced with Tensorflow.
In the code below you can see what I have tried. I can run the code properly without the additional convolutional layer, however my goal is to add more layers after the ResNet.
With this attempt at adding the extra conv layer, this type error is returned:
TypeError: Expected float32, got OrderedDict([('resnet_v1_101/conv1', ...
Once I have added more layers, I would like to start training on a very small test set to see if my model can overfit.
import tensorflow as tf
import tensorflow.contrib.slim as slim
from tensorflow.contrib.slim.python.slim.nets import resnet_v1
import matplotlib.pyplot as plt
numclasses = 17
from google.colab import drive
def decode_text(filename):
    img =
    img = tf.image.resize_bilinear(tf.expand_dims(img, 0), [224, 224])
    img = tf.squeeze(img, 0)
    img.set_shape((None, None, 3))
    return img
dataset ='gdrive/My Drive/5LSM0collab/filenames.txt', tf.string))
dataset =
dataset = dataset.batch(2, drop_remainder=True)
img_1 = dataset.make_one_shot_iterator().get_next()
net = resnet_v1.resnet_v1_101(img_1, 2048, is_training=False, global_pool=False, output_stride=8)
net = slim.conv2d(net, numclasses, 1)
sess = tf.Session()
global_init = tf.global_variables_initializer()
local_init = tf.local_variables_initializer()
img_out, conv_out = sess.run((img_1, net))  # sess.run and its first fetch (img_1) are assumed
resnet_v1.resnet_v1_101 does not return just net, but instead returns a tuple net, end_points. The second element is a dictionary, which is presumably why you are getting this particular error message.
From the documentation of this function:
net: A rank-4 tensor of size [batch, height_out, width_out, channels_out]. If global_pool is False, then height_out and width_out are reduced by a factor of output_stride compared to the respective height_in and width_in, else both height_out and width_out equal one. If num_classes is 0 or None, then net is the output of the last ResNet block, potentially after global average pooling. If num_classes is a non-zero integer, net contains the pre-softmax activations.
end_points: A dictionary from components of the network to the corresponding activation.
So you can write for example:
net, _ = resnet_v1.resnet_v1_101(img_1, 2048, is_training=False, global_pool=False, output_stride=8)
net = slim.conv2d(net, numclasses, 1)
You can also choose an intermediate layer, e.g.:
_, end_points = resnet_v1.resnet_v1_101(img_1, 2048, is_training=False, global_pool=False, output_stride=8)
net = slim.conv2d(end_points["main_Scope/resnet_v1_101/block3"], numclasses, 1)
(you can look into end_points to find the names of the endpoints. Your scope name will be different than main_Scope.)
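The unpacking pattern above can be tried without TensorFlow. The sketch below uses `fake_resnet`, a hypothetical stand-in that only mimics the (net, end_points) return structure; the strings are placeholders for tensors and the endpoint names are illustrative.

```python
from collections import OrderedDict


def fake_resnet(inputs):
    """Hypothetical stand-in that mimics the (net, end_points) return structure."""
    end_points = OrderedDict()
    end_points['resnet_v1_101/conv1'] = 'conv1 activations'
    end_points['resnet_v1_101/block3'] = 'block3 activations'
    net = 'final activations'
    return net, end_points


# Without unpacking, the result is a tuple; passing it on as if it were a
# single tensor is what leads to the type error in the question.
result = fake_resnet('images')
print(type(result).__name__)  # tuple

# Unpack both parts, then list the endpoint names to pick an intermediate layer.
net, end_points = fake_resnet('images')
for name in end_points:
    print(name)

block3 = end_points['resnet_v1_101/block3']
```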

How to create a serving_input_fn in Tensorflow 2.0 for image preprocessing?

I am using Tensorflow 2.0 and am able to train a CNN for image classification of 3-channel images. I perform image preprocessing within the data input pipeline (shown below) and would like to include the preprocessing functionality in the served model itself. My model is served with a TF Serving Docker container and the Predict API.
The data input pipeline for training is based on the documentation at
My pipeline image preprocessing function is load_and_preprocess_from_path_label:
def load_and_preprocess_path(image_path):
    # Load image (read call assumed to be tf.io.read_file)
    image = tf.io.read_file(image_path)
    image = tf.image.decode_png(image)
    # Normalize to [0,1] range
    image /= 255
    # Convert to HSV and Resize
    image = tf.image.rgb_to_hsv(image)
    image = tf.image.resize(image, [HEIGHT, WIDTH])
    return image

def load_and_preprocess_from_path_label(image_path, label):
    return load_and_preprocess_path(image_path), label
With lists of image paths, the pipeline prefetches and performs image preprocessing using tf functions within load_and_preprocess_from_path_label:
all_image_paths, all_image_labels = parse_labeled_image_paths()
x_train, x_test, y_train, y_test = sklearn.model_selection.train_test_split(all_image_paths, all_image_labels, test_size=0.2)
# Create a TensorFlow Dataset of training images and labels
ds =, y_train))
image_label_ds =
IMAGE_COUNT = len(all_image_paths)
ds = image_label_ds.apply(
ds = ds.batch(BATCH_SIZE)
ds = ds.prefetch(buffer_size=AUTOTUNE)
# Create image pipeline for model
image_batch, label_batch = next(iter(ds))
feature_map_batch = model(image_batch)
# Train model, epochs=5)
Previous TensorFlow examples I've found use serving_input_fn() together with tf.placeholder, which seems to no longer exist in TensorFlow 2.0.
An example of serving_input_fn in TensorFlow 2.0 is shown in the documentation. Since I am using the Predict API, it looks like I would need something similar to:
serving_input_fn = tf.estimator.export.build_raw_serving_input_receiver_fn(...)
# Save the model with the serving preprocessing function
model.export_saved_model(MODEL_PATH, serving_input_fn)
Ideally, the served model would accept a 4D Tensor of 3-channel image samples of any size and would perform the initial image preprocessing on them (decode image, normalize, convert to HSV, and resize) before classifying.
How can I create a serving_input_fn in Tensorflow 2.0 with a preprocessing function similar to my load_and_preprocess_path function?
I faced a similar issue when upgrading. It appears that the way to achieve this in Tensorflow 2 is to provide a function which the saved model can use to make the predictions, something like:
def serve_load_and_preprocess_path(image_paths):
    # image_paths: 1-D tf.string tensor of file paths
    # loaded images may need converting to the tensor shape needed for the model
    loaded_images = tf.map_fn(load_and_preprocess_path, image_paths, dtype=tf.float32)
    predictions = model(loaded_images)
    return predictions

serve_load_and_preprocess_path = tf.function(serve_load_and_preprocess_path)
serve_load_and_preprocess_path = serve_load_and_preprocess_path.get_concrete_function(
    image_paths=tf.TensorSpec([None,], dtype=tf.string))

# check the models give the same output
loaded = tf.saved_model.load(MODEL_PATH)
loaded_model_predictions = loaded.serve_load_and_preprocess_path(...)
np.testing.assert_allclose(trained_model_predictions, loaded_model_predictions, atol=1e-6)
Expanding and simplifying @harry-salmon's answer. For me the following worked:
def save_model_with_serving_signature(model, model_path):
    @tf.function(input_signature=[tf.TensorSpec(shape=[None, ], dtype=tf.string)])
    def serve_load_and_preprocess_path(image_paths):
        return model(tf.map_fn(load_and_preprocess_path, image_paths, dtype=tf.float32))
Note: dtype=tf.float32 in the map function was important; it didn't work without it. I found the solution here. I also simplified the concrete-function step by simply adding a decorator (see this for details).
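Once a model with such a string-input signature is served, a client request against the TF Serving REST Predict API could look roughly like the sketch below. It only builds the request and does not send it. The host and port, the model name `my_model`, the image paths, and the helper `build_predict_request` are all hypothetical, and the signature name `serving_default` is an assumption about how the model was saved. Note that with a path-based signature, the paths are read by the serving process, so they would have to exist inside the container.

```python
import json
import urllib.request


def build_predict_request(host, model_name, image_paths, signature_name='serving_default'):
    """Build (but do not send) a POST request for the TF Serving REST Predict API."""
    url = 'http://{}/v1/models/{}:predict'.format(host, model_name)
    # Row format: one entry per instance; each instance is a single string here.
    payload = {'signature_name': signature_name, 'instances': list(image_paths)}
    data = json.dumps(payload).encode('utf-8')
    return urllib.request.Request(
        url,
        data=data,
        headers={'Content-Type': 'application/json'},
        method='POST',
    )


# Hypothetical host, model name, and paths.
request = build_predict_request(
    'localhost:8501', 'my_model', ['/data/cat.png', '/data/dog.png'])

print(request.full_url)
print(request.data.decode('utf-8'))
# Against a running server, the request could be sent with:
# urllib.request.urlopen(request)
```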

Error when call prediction with base 64 input

I am using TensorFlow Hub's example to export a saved_model to be served with TensorFlow Serving using Docker.
I followed some instructions on the internet and modified export_model as shown below:
def export_model(module_spec, class_count, saved_model_dir):
    """Exports model for serving.

    Args:
      module_spec: The hub.ModuleSpec for the image module being used.
      class_count: The number of classes.
      saved_model_dir: Directory in which to save exported model and variables.
    """
    # The SavedModel should hold the eval graph.
    sess, in_image, _, _, _, _ = build_eval_session(module_spec, class_count)
    # Shape of [None] means we can have a batch of images.
    image = tf.placeholder(shape=[None], dtype=tf.string)
    with sess.graph.as_default() as graph:
        # simple_save call and legacy_init_op assumed from the surviving arguments
        tf.saved_model.simple_save(
            sess,
            saved_model_dir,
            #inputs={'image': in_image},
            inputs = {'image_bytes': image},
            outputs={'prediction': graph.get_tensor_by_name('final_result:0')},
            legacy_init_op=tf.group(tf.tables_initializer(), name='legacy_init_op')
        )
The problem is that when I try to call the API using Postman, it fails with this error:
"error": "Tensor Placeholder_1:0, specified in either feed_devices or fetch_devices was not found in the Graph"
Do I need to modify the retraining process so it can accept base64 input?