I am serving the Inception model with TensorFlow Serving. I am doing this on Azure Kubernetes, so not via the more standard and well-documented Google Cloud.
In any event, this is all working. The bit I am confused about is that the predictions come back as an array of floats. These values map to the original labels passed in during training, but without the original labels file there is no way to work out which label each probability relates to.
Before I moved to serving, I was simply using an inference script that cross-references the output against the labels file, which I stored along with the frozen model at training time. With serving, this does not work.
So my question is: how can I get the labels associated with the model and, ideally, get the prediction to return both the labels and the probabilities?
I tried the approach suggested by @user1371314 but I couldn't get it to work. Another solution that worked is to create a tensor (instead of a constant) and map it, together with only the first element of the output layer, when saving the model. Put together, it looks like this:
# get labels names and create a tensor from it
label_names_tensor = tf.convert_to_tensor(label_names)
# save the model and map the labels to the output layer
inputs={'image': model.input},
outputs={'label' : label_names_tensor,'prediction': model.output[0]})
When you make a prediction after serving your model, you will get the following result:
"predictions": [
    { "label": "label-name", "prediction": 0.114107 },
    { "label": "label-name", "prediction": 0.288598 },
    { "label": "label-name", "prediction": 0.17436 },
    { "label": "label-name", "prediction": 0.186366 },
    { "label": "label-name", "prediction": 0.236568 }
]
I am sure there is a way to return a mapping directly using the various TF ops; however, I have at least managed to package the labels into the model and return them in the prediction along with the probabilities.
What I did was create a tf.constant from the labels array and then add that tensor to the output tensors passed to tf.saved_model.signature_def_utils.build_signature_def.
Now when I get a prediction, I get the float array and also an array of labels, and I can match them up on the client side.
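The client-side matching can be as simple as zipping the two arrays. A minimal sketch, assuming the response has already been parsed into a dict; the keys `labels` and `scores` are hypothetical and depend on how the outputs were named in the signature, and the values are made up:

```python
def pair_labels(response):
    """Pair each label with its probability, highest probability first."""
    labels = response["labels"]  # hypothetical output key holding the label strings
    scores = response["scores"]  # hypothetical output key holding the float array
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    return sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)


# Made-up values, for illustration only.
example = {"labels": ["daisy", "roses", "tulips"], "scores": [0.1, 0.7, 0.2]}
ranked = pair_labels(example)
```

The first element of `ranked` is then the most likely label together with its probability.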
I'm trying to save my model so that when called from tf-serving the output is:
"results": [
    { "label1": x.xxxxx, "label2": x.xxxxx },
    { "label1": x.xxxxx, "label2": x.xxxxx }
]
where label1 and label2 are my labels and x.xxxxx is the probability of that label.
This is what I'm trying:
class TFModel(tf.Module):
    def __init__(self, model: tf.keras.Model) -> None:
        self.labels = ['label1', 'label2']
        self.model = model

    @tf.function(input_signature=[tf.TensorSpec(shape=(1, ), dtype=tf.string)])
    def prediction(self, pagetext: str):
        return { 'results': tf.constant([{k: v for dct in [{self.labels[c]: f"{x:.5f}"} for (c,x) in enumerate(results[i])] for k, v in dct.items()}
                                         for i in range(len(results.numpy()))])}
# and then save it:
tf_model_wrapper = TFModel(classifier_model)
Side note: Apparently, in TensorFlow v2.0, if signatures is omitted it should scan the object for the first @tf.function (according to this: https://www.tensorflow.org/api_docs/python/tf/saved_model/save), but in reality that doesn't seem to work. Instead, the model saves successfully with no errors, the @tf.function is not called, and the default output is returned.
The error I get from the above is:
ValueError: Got a non-Tensor value <tf.Operation 'PartitionedCall' type=PartitionedCall> for key 'output_0' in the output of the function __inference_prediction_125493 used to generate the SavedModel signature 'serving_default'. Outputs for functions used as signatures must be a single Tensor, a sequence of Tensors, or a dictionary from string to Tensor.
I wrapped the result in tf.constant above because of this error, thinking it might be a quick fix, but I think it's me just being naive and not understanding Tensors properly.
I tried a bunch of other things before learning that all outputs must be return values.
How can I change the output to be as I want it to be?
You can see a Tensor as a multidimensional vector, i.e., a structure with a fixed size and number of dimensions whose elements all share the same type. Your return value is a map from a string to a list of dictionaries. A list of dictionaries cannot be converted to a tensor, because there is no guarantee that the number of dimensions and their sizes are constant, nor that all elements share the same type.
You could instead return the raw output of your network, which should be a tensor, and do your post-processing outside of TensorFlow Serving.
If you really want something like what is in your question, you can use a Tensor of strings instead, with code like this:
labels = tf.constant(['label1', 'label2'])
# if your batch size is dynamic, you can use tf.shape on your results variable to find it at runtime
batch_size = 32
# assuming your model returns something with the shape (N,2)
results = tf.random.uniform((batch_size,2))
res_as_str = tf.strings.as_string(results, precision=5)
return {
    "results": tf.stack(
        [tf.tile(labels[None, :], [batch_size, 1]), res_as_str], axis=-1
    )
}
The output will be a dictionary mapping the key "results" to a Tensor of shape (batch, number of labels, 2), with the last dimension containing the label name and its corresponding value.
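On the client, a result of that shape can be turned back into one dictionary per batch row. A minimal sketch, assuming the string tensor arrives as plain nested lists of text; the helper `decode_results` and the response values are made up for illustration:

```python
def decode_results(results):
    """Turn [[[label, value_str], ...], ...] into one {label: float} dict per batch row."""
    return [{label: float(value) for label, value in row} for row in results]


# Made-up response body for a batch of two inputs.
response = {
    "results": [
        [["label1", "0.12345"], ["label2", "0.87655"]],
        [["label1", "0.50000"], ["label2", "0.50000"]],
    ]
}
decoded = decode_results(response["results"])
```

This keeps the served model's output a regular string tensor while still giving the caller the label-to-probability mapping asked for in the question.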
I am using the Language Interpretability Toolkit (LIT) to load and analyze a BERT model that I pre-trained on an NER task.
However, when I start the LIT script with the path to my pre-trained model passed to it, it fails to initialize the weights and tells me:
modeling_utils.py:648] loading weights file bert_remote/examples/token-classification/Data/Models/results_21_03_04_cleaned_annotations/04.03._8_16_5e-5_cleaned_annotations/04-03-2021 (15.22.23)/pytorch_model.bin
modeling_utils.py:739] Weights of BertForTokenClassification not initialized from pretrained model: ['bert.pooler.dense.weight', 'bert.pooler.dense.bias']
modeling_utils.py:745] Weights from pretrained model not used in BertForTokenClassification: ['bert.embeddings.position_ids']
It then simply uses the bert-base-german-cased version of BERT, which of course doesn't have my custom labels and thus fails to predict anything. I think it might have to do with PyTorch, but I can't find the error.
If relevant, here is how I load my dataset into CoNLL 2003 format (modification of the dataloader scripts found here):
def __init__(self):
    # Read CoNLL test files
    self._examples = []
    data_path = "lit_remote/lit_nlp/examples/datasets/NER_Data"
    with open(os.path.join(data_path, "test.txt"), "r", encoding="utf-8") as f:
        lines = f.readlines()
    for line in lines[:2000]:
        if line != "\n":
            token, label = line.split(" ")
            self._examples.append({
                'token': token,
                'label': label,
            })
        else:
            self._examples.append({
                'token': "\n",
                'label': "O"
            })

def spec(self):
    return {
        'token': lit_types.Tokens(),
        'label': lit_types.SequenceTags(align="token"),
    }
And this is how I initialize the model and start the LIT server (modification of the simple_pytorch_demo.py script found here):
def __init__(self, model_name_or_path):
self.tokenizer = transformers.AutoTokenizer.from_pretrained(
model_config = transformers.AutoConfig.from_pretrained(
num_labels=15, # FIXME CHANGE
# This is a just a regular PyTorch model.
self.model = _from_pretrained(
## Some omitted snippets here
def input_spec(self) -> lit_types.Spec:
    return {
        "token": lit_types.Tokens(),
        "label": lit_types.SequenceTags(align="token")
    }

def output_spec(self) -> lit_types.Spec:
    return {
        "tokens": lit_types.Tokens(),
        "probas": lit_types.MulticlassPreds(parent="label", vocab=self.LABELS),
        "cls_emb": lit_types.Embeddings()
    }
This actually seems to be expected behaviour. In the documentation of the GPT models the HuggingFace team writes:
This will issue a warning about some of the pretrained weights not being used and some weights being randomly initialized. That’s because we are throwing away the pretraining head of the BERT model to replace it with a classification head which is randomly initialized.
So it seems to not be a problem for the fine-tuning. In my use case described above it worked despite the warning as well.
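One way to read the two log lines is as a plain comparison of parameter names between the model and the checkpoint. The following is a simplified sketch of that idea, not the actual transformers loading code; `compare_keys` and the two shared key names are hypothetical, while the other names are taken from the log above:

```python
def compare_keys(model_keys, checkpoint_keys):
    """Return (names only in the model, names only in the checkpoint), each sorted."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    not_initialized = sorted(model_keys - checkpoint_keys)  # left at their initial values
    not_used = sorted(checkpoint_keys - model_keys)         # present in the file but ignored
    return not_initialized, not_used


# Hypothetical names standing in for the many parameters both sides share.
shared = ["bert.encoder.weight", "classifier.weight"]
model_keys = shared + ["bert.pooler.dense.weight", "bert.pooler.dense.bias"]
checkpoint_keys = shared + ["bert.embeddings.position_ids"]

not_initialized, not_used = compare_keys(model_keys, checkpoint_keys)
```

Under this reading, only the names that appear on one side are reported, and everything else is loaded from the checkpoint.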
A simple question, and I'm sure the answer is straightforward, but I'm really struggling to match the model's input shape with the tensor I am fitting.
This simple code
let tf = require('@tensorflow/tfjs-node');

let features = {
    x: [1,2,3,4,5,6,7,8,9],
    y: [1,2,3,4,5,6,7,8,9]
}
let tensorfeature = tf.tensor2d(Object.values(features))
const model = tf.sequential();
model.add(tf.layers.dense({
    inputShape: tensorfeature.shape,
    units: 1
}))
const optimizer = tf.train.sgd(0.005);
model.compile({optimizer: optimizer, loss: 'meanAbsoluteError'});
model.fit(tensorfeature,
    {epochs: 5}
)
Results in Error: Error when checking input: expected dense_Dense1_input to have 3 dimension(s). but got array with shape 2,9
I tried multiple things with reshape, slice, etc., with no luck. Can someone point out what exactly is wrong?
model.fit takes at least two parameters, x and y, which are either tensors or arrays of tensors. The config object is the third parameter.
Also, the feature tensor (tensorfeature) passed to model.fit should have one more dimension than the model's inputShape. Since tensorfeature.shape is used as the inputShape, the tensor needs an extra leading dimension before it can be used to train the model. This can be done with reshape or expandDims.
model.fit(tensorfeature.expandDims(0))
// or possibly
model.fit(tensorfeature.reshape([1, ...tensorfeature.shape]))
This shape mismatch between the model and the training data has been discussed here and there
I have a model saved in SavedModel format (.pb). After serving the model without problems, I try to make a prediction via TensorFlow Serving. TF Serving requires me to pass the data as a list; otherwise I get TypeError: Object of type 'ndarray' is not JSON serializable. But when I pass a list, the response is an error.
The input is
value = [1, 2, 3, 4, 5]
body = {"signature_name": "serving_default",
        "instances": [[value]]}
res = requests.post(url=url, data=json.dumps(body))
and the answer is { "error": "In[0] is not a matrix. Instead it has shape [1,1,5]\n\t [[{{node sequential/dense/Relu}}]]" }
I know the model works, the input without using tensorflow serving is
value = np.array([1,2,3,4,5])
So the problem is: how can I use TensorFlow Serving if it requires a list as input but the model requires an np.array as input?
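For reference, the shape reported in that error can be reproduced without a server: `[[value]]` nests the five numbers two levels deep, which gives [1,1,5]. A minimal sketch, assuming the model expects a matrix of shape (batch, 5) as the "is not a matrix" message suggests; `nested_shape` is a hypothetical helper written only for this illustration:

```python
import json


def nested_shape(value):
    """Return the shape of a regular nested list, e.g. [[1, 2, 3]] -> (1, 3)."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        value = value[0] if value else None
    return tuple(shape)


value = [1, 2, 3, 4, 5]  # the same data ndarray.tolist() returns for a 1-D array

too_deep = [[value]]  # shape (1, 1, 5): one level of nesting too many
one_row = [value]     # shape (1, 5): a batch containing a single instance

body = json.dumps({"signature_name": "serving_default", "instances": one_row})
```

The ndarray itself never has to be sent: converting it to a plain list first keeps the body JSON serializable.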
I suppose you should do it this way:
value = <ndarray>
data = value.tolist()
body = {
    "signature_name": "serving_default",
    "instances": data}
I currently follow the tutorial to retrain Inception for image classification:
However, when I make a prediction with the API, I only get the index of my class as the label. I would like the API to return a string with the actual class name instead, e.g. instead of
- key: '0'
  prediction: 4
  scores:
  - 8.11998e-09
  - 2.64907e-08
  - 1.10307e-06
I would like to get:
- key: '0'
  prediction: ROSES
  scores:
  - 8.11998e-09
  - 2.64907e-08
  - 1.10307e-06
Looking at the reference for the Google API it should be possible:
I already tried changing the following in model.py:
outputs = {
    'key': keys.name,
    'prediction': tensors.predictions[0].name,
    'scores': tensors.predictions[1].name
}
tf.add_to_collection('outputs', json.dumps(outputs))
to:
if tensors.predictions[0].name == 0:
    pred_name = 'roses'
elif tensors.predictions[0].name == 1:
    pred_name = 'tulips'
outputs = {
    'key': keys.name,
    'prediction': pred_name,
    'scores': tensors.predictions[1].name
}
tf.add_to_collection('outputs', json.dumps(outputs))
but this doesn't work.
My next idea was to change this part in the preprocess.py file, so that instead of getting the index I use the string label.
def process(self, row, all_labels):
    try:
        row = row.element
    except AttributeError:
        pass
    if not self.label_to_id_map:
        for i, label in enumerate(all_labels):
            label = label.strip()
            if label:
                self.label_to_id_map[label] = label  # i
label_ids = []
for label in row[1:]:
except KeyError:
but this gives the error:
TypeError: 'roses' has type <type 'str'>, but expected one of: (<type 'int'>, <type 'long'>) [while running 'Embed and make TFExample']
hence I thought that I should change something here in preprocess.py, in order to allow strings:
example = tf.train.Example(features=tf.train.Features(feature={
'image_uri': _bytes_feature([uri]),
'embedding': _float_feature(embedding.ravel().tolist()),
if label_ids:
But I don't know how to change it appropriately, as I could not find something like str_list. Could anyone please help me out here?
Online prediction certainly allows this, but the model itself needs to be updated to do the conversion from int to string.
Keep in mind that the Python code is just building a graph which describes what computation to do in your model -- you're not sending the Python code to online prediction, you're sending the graph you build.
That distinction is important because the changes you have made are in Python -- you don't yet have any inputs or predictions, so you won't be able to inspect their values. What you need to do instead is add the equivalent lookups to the graph that you're exporting.
You could modify the code like so:
labels = tf.constant(['cars', 'trucks', 'suvs'])
predicted_indices = tf.argmax(softmax, 1)
prediction = tf.gather(labels, predicted_indices)
And leave the inputs/outputs untouched from the original code.
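To make the intent of the two ops concrete, here is a plain-Python mirror of what argmax followed by gather computes for a batch. This is only an illustration of the lookup, not graph code: in the exported model the lookup has to be done with TensorFlow ops as shown above. The function names and score values are made up:

```python
def argmax(row):
    """Index of the largest value in a list of scores."""
    return max(range(len(row)), key=row.__getitem__)


def predict_labels(labels, softmax_rows):
    """For each row of scores, look up the label at the index of the highest score."""
    indices = [argmax(row) for row in softmax_rows]  # plays the role of tf.argmax(softmax, 1)
    return [labels[i] for i in indices]              # plays the role of tf.gather(labels, indices)


labels = ["cars", "trucks", "suvs"]
softmax_rows = [[0.1, 0.7, 0.2], [0.6, 0.3, 0.1]]  # made-up scores for a batch of two
predicted = predict_labels(labels, softmax_rows)
```

Each row of scores yields exactly one label string, which is what the prediction output would carry in place of the integer index.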