How to feed inputs into a loaded Tensorflow model using C++ - tensorflow

I want to create and train a model, export it and run inference in C++.
I'm following the tutorial listed here: https://www.tensorflow.org/tutorials/wide_and_deep
I'm also trying to use the SavedModel approach as described here since this is the canonical way to export TensorFlow graphs for serving:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/saved_model/README.md.
At the very end, I export the saved model as follows:
feature_spec = tf.contrib.layers.create_feature_spec_for_parsing(feature_columns)
serving_input_fn = input_fn_utils.build_parsing_serving_input_fn(feature_spec)
output = model.export_savedmodel(model_dir, serving_input_fn, as_text=True)
print('Model saved to {}'.format(output))
I see the saved_model.pbtxt has the following signature definition.
signature_def {
key: "serving_default"
value {
inputs {
key: "inputs"
value {
name: "input_example_tensor:0"
dtype: DT_STRING
tensor_shape {
dim {
size: -1
}
}
}
}
outputs {
...
I can load the saved model on the C++ side
SavedModelBundle bundle;
const std::string graph_path = "models/1498572863";
const std::unordered_set<std::string> tags = {"serve"};
Status status = LoadSavedModel(session_options,
run_options, graph_path,
tags, &bundle);
I'm stuck at the last part where I need to feed the input into this model.
The Run function expects the input parameter to be of the form: std::vector<std::pair<string, Tensor>>.
I would have expected this to be a vector of pairs where the key is the feature name used in the python code and the Tensor is multiple values for that feature.
However, it seems to expect the string to be "input_example_tensor".
I'm not sure how I'm supposed to now feed the model with different features using a single Tensor.
std::vector<string> output_tensor_names = {
"binary_logistic_head/_classification_output_alternatives/classes_tensor"};
// How do I create input_tensor?
status = bundle.session->Run({{"input_example_tensor", input_tensor}}
output_tensor_names, {}, &outputs);
Solution
I did something like this
tensorflow::Example example;
auto& tf_feature_map = *(example.mutable_features()->mutable_feature());
tf_feature_map["name"].mutable_int64_list()->add_value(15);
const std::string& serialized = example.SerializeAsString();
tensorflow::Input input({serialized});
status = bundle.session->Run({{"input_example_tensor", input.tensor()}}
output_tensor_names, {}, &outputs);

Your model signature suggests that it is expecting a DT_STRING tensor as input. When using tensorflow::Example, this typically means that the protocol buffer needs to be serialized into a tensor with a string as the type of its elements.
To convert the tensorflow::Example object to a string, you can use the protocol buffer methods such as SerializeToString, SerializeAsString etc.
Hope that helps.

Related

Vertex AI Model Batch prediction, issue with referencing existing model and input file on Cloud Storage

I'm struggling to correctly set Vertex AI pipeline which does the following:
read data from API and store to GCS and as as input for batch prediction.
get an existing model (Video classification on Vertex AI)
create Batch prediction job with input from point 1.
As it will be seen, I don't have much experience with Vertex Pipelines/Kubeflow thus I'm asking for help/advice, hope it's just some beginner mistake.
this is the gist of the code I'm using as pipeline
from google_cloud_pipeline_components import aiplatform as gcc_aip
from kfp.v2 import dsl
from kfp.v2.dsl import component
from kfp.v2.dsl import (
Output,
Artifact,
Model,
)
PROJECT_ID = 'my-gcp-project'
BUCKET_NAME = "mybucket"
PIPELINE_ROOT = "{}/pipeline_root".format(BUCKET_NAME)
#component
def get_input_data() -> str:
# getting data from API, save to Cloud Storage
# return GS URI
gcs_batch_input_path = 'gs://somebucket/file'
return gcs_batch_input_path
#component(
base_image="python:3.9",
packages_to_install=['google-cloud-aiplatform==1.8.0']
)
def load_ml_model(project_id: str, model: Output[Artifact]):
"""Load existing Vertex model"""
import google.cloud.aiplatform as aip
model_id = '1234'
model = aip.Model(model_name=model_id, project=project_id, location='us-central1')
#dsl.pipeline(
name="batch-pipeline", pipeline_root=PIPELINE_ROOT,
)
def pipeline(gcp_project: str):
input_data = get_input_data()
ml_model = load_ml_model(gcp_project)
gcc_aip.ModelBatchPredictOp(
project=PROJECT_ID,
job_display_name=f'test-prediction',
model=ml_model.output,
gcs_source_uris=[input_data.output], # this doesn't work
# gcs_source_uris=['gs://mybucket/output/'], # hardcoded gs uri works
gcs_destination_output_uri_prefix=f'gs://{PIPELINE_ROOT}/prediction_output/'
)
if __name__ == '__main__':
from kfp.v2 import compiler
import google.cloud.aiplatform as aip
pipeline_export_filepath = 'test-pipeline.json'
compiler.Compiler().compile(pipeline_func=pipeline,
package_path=pipeline_export_filepath)
# pipeline_params = {
# 'gcp_project': PROJECT_ID,
# }
# job = aip.PipelineJob(
# display_name='test-pipeline',
# template_path=pipeline_export_filepath,
# pipeline_root=f'gs://{PIPELINE_ROOT}',
# project=PROJECT_ID,
# parameter_values=pipeline_params,
# )
# job.run()
When running the pipeline it throws this exception when running Batch prediction:
details = "List of found errors: 1.Field: batch_prediction_job.model; Message: Invalid Model resource name.
so I'm not sure what could be wrong. I tried to load model in the notebook (outside of component) and it correctly returns.
Second issue I'm having is referencing GCS URI as output from component to batch job input.
input_data = get_input_data2()
gcc_aip.ModelBatchPredictOp(
project=PROJECT_ID,
job_display_name=f'test-prediction',
model=ml_model.output,
gcs_source_uris=[input_data.output], # this doesn't work
# gcs_source_uris=['gs://mybucket/output/'], # hardcoded gs uri works
gcs_destination_output_uri_prefix=f'gs://{PIPELINE_ROOT}/prediction_output/'
)
During compilation, I get following exception TypeError: Object of type PipelineParam is not JSON serializable, though I think this could be issue of ModelBatchPredictOp component.
Again any help/advice appreciated, I'm dealing with this from yesterday, so maybe I missed something obvious.
libraries I'm using:
google-cloud-aiplatform==1.8.0
google-cloud-pipeline-components==0.2.0
kfp==1.8.10
kfp-pipeline-spec==0.1.13
kfp-server-api==1.7.1
UPDATE
After comments, some research and tuning, for referencing model this works:
#component
def load_ml_model(project_id: str, model: Output[Artifact]):
region = 'us-central1'
model_id = '1234'
model_uid = f'projects/{project_id}/locations/{region}/models/{model_id}'
model.uri = model_uid
model.metadata['resourceName'] = model_uid
and then I can use it as intended:
batch_predict_op = gcc_aip.ModelBatchPredictOp(
project=gcp_project,
job_display_name=f'batch-prediction-test',
model=ml_model.outputs['model'],
gcs_source_uris=[input_batch_gcs_path],
gcs_destination_output_uri_prefix=f'gs://{BUCKET_NAME}/prediction_output/test'
)
UPDATE 2
regarding GCS path, a workaround is to define path outside of the component and pass it as an input parameter, for example (abbreviated):
#dsl.pipeline(
name="my-pipeline",
pipeline_root=PIPELINE_ROOT,
)
def pipeline(
gcp_project: str,
region: str,
bucket: str
):
ts = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
gcs_prediction_input_path = f'gs://{BUCKET_NAME}/prediction_input/video_batch_prediction_input_{ts}.jsonl'
batch_input_data_op = get_input_data(gcs_prediction_input_path) # this loads input data to GCS path
batch_predict_op = gcc_aip.ModelBatchPredictOp(
project=gcp_project,
model=training_job_run_op.outputs["model"],
job_display_name='batch-prediction',
# gcs_source_uris=[batch_input_data_op.output],
gcs_source_uris=[gcs_prediction_input_path],
gcs_destination_output_uri_prefix=f'gs://{BUCKET_NAME}/prediction_output/',
).after(batch_input_data_op) # we need to add 'after' so it runs after input data is prepared since get_input_data doesn't returns anything
still not sure, why it doesn't work/compile when I return GCS path from get_input_data component
I'm glad you solved most of your main issues and found a workaround for model declaration.
For your input.output observation on gcs_source_uris, the reason behind it is because the way the function/class returns the value. If you dig inside the class/methods of google_cloud_pipeline_components you will find that it implements a structure that will allow you to use .outputs from the returned value of the function called.
If you go to the implementation of one of the components of the pipeline you will find that it returns an output array from convert_method_to_component function. So, in order to have that implemented in your custom class/function your function should return a value which can be called as an attribute. Below is a basic implementation of it.
class CustomClass():
def __init__(self):
self.return_val = {'path':'custompath','desc':'a desc'}
#property
def output(self):
return self.return_val
hello = CustomClass()
print(hello.output['path'])
If you want to dig more about it you can go to the following pages:
convert_method_to_component, which is the implementation of convert_method_to_component
Properties, basics of property in python.

How to send numpy array to sagemaker endpoint using lambda function

How to invoke sagemaker endpoint with input data type numpy.ndarray.
I have deployed a sagemaker model and trying to hit it using lambda function.
But I am unable to figure out how to do it. I am getting server error.
One row of the Input data.
The total data set has shape=(91,5,12).
The below is only one row of Input data.
array([[[0.30440741, 0.30209799, 0.33520652, 0.41558442, 0.69096432,
0.69611016, 0.25153326, 0.98333333, 0.82352941, 0.77187154,
0.7664042 , 0.74468085],
[0.30894981, 0.33151662, 0.22907725, 0.46753247, 0.69437367,
0.70410559, 0.29259044, 0.9 , 0.80882353, 0.79401993,
0.89501312, 0.86997636],
[0.33511896, 0.34338939, 0.24065546, 0.48051948, 0.70384005,
0.71058715, 0.31031288, 0.86666667, 0.89705882, 0.82724252,
0.92650919, 0.89125296],
[0.34617355, 0.36150251, 0.23726854, 0.54545455, 0.71368726,
0.71703244, 0.30228356, 0.85 , 0.86764706, 0.86157254,
0.97112861, 0.94089835],
[0.36269508, 0.35923332, 0.40285461, 0.62337662, 0.73325475,
0.7274392 , 0.26241391, 0.85 , 0.82352941, 0.89922481,
0.9343832 , 0.90780142]]])
I am using the following code but unable to invoke the endpoint
import boto3
def lambda_handler(event, context):
# The SageMaker runtime is what allows us to invoke the endpoint that we've created.
runtime = boto3.Session().client('sagemaker-runtime')
endpoint = 'sagemaker-tensorflow-2019-04-22-07-16-51-717'
print('givendata ', event['body'])
# data = numpy.array([numpy.array(xi) for xi in event['body']])
data = event['body']
print('numpy array ', data)
# Now we use the SageMaker runtime to invoke our endpoint, sending the review we were given
response = runtime.invoke_endpoint(EndpointName = endpoint,# The name of the endpoint we created
ContentType = 'application/json', # The data format that is expected
Body = data) # The actual review
# The response is an HTTP response whose body contains the result of our inference
result = response['Body'].read().decode('utf-8')
print('response', result)
# Round the result so that our web app only gets '1' or '0' as a response.
result = round(float(result))
return {
'statusCode' : 200,
'headers' : { 'Content-Type' : 'text/plain', 'Access-Control-Allow-Origin' : '*' },
'body' : str(result)
}
I am unable to figure out what should be written in place of ContentType.
Because I am not aware of MIME type in case of numpy.ndarray.
Illustration of what I had and how I solved
from sagemaker.tensorflow import TensorFlowPredictor
predictor = TensorFlowPredictor('sagemaker-tensorflow-serving-date')
data = np.array(raw_data)
response = predictor.predict(data=data)
predictions = response['predictions']
print(predictions)
What I did to find the answer:
Looked up predictions.py and content_types.py implementation in sagemaker python library to see what content types it used and what arguments it had.
First I thought that application/x-npy content_type was used and thus tried using serialisation code from predictor.py and passing the application/x-npy as content_type to invoke_endpoint.
After receiving 415 (unsupported media type), the issue was still the content_type. The following print statements helped me to reveal what content_type predictor actually uses (application/json) and thus I took the appropriate serialisation code from predictor.py
from sagemaker.tensorflow import TensorFlowPredictor
predictor = TensorFlowPredictor('sagemaker-tensorflow-serving-date')
data = np.array(raw_data)
response = predictor.predict(data=data)
print(predictor.content_type)
print(predictor.accept)
predictions = response['predictions']
print(predictions)
TL;DR
Solution for lambda:
import json
import boto3
ENDPOINT_NAME = 'sagemaker-tensorflow-serving-date'
config = botocore.config.Config(read_timeout=80)
runtime= boto3.client('runtime.sagemaker', config=config)
data = np.array(raw_data)
payload = json.dumps(data.tolist())
response = runtime.invoke_endpoint(EndpointName=ENDPOINT_NAME,
ContentType='application/json',
Body=payload)
result = json.loads(response['Body'].read().decode())
res = result['predictions']
Note: numpy is not included in lambda thus you would either include the numpy yourself or instead of data.tolist() operate with python list and json.dump that list (of lists). From your code it seems to me you have python list instead of numpy array, so simple json dump should work.
If you are training and hosting custom algorithm on SageMaker using TensorFlow, you can serialize/de-serialize the request and response format as JSON as in TensorFlow Serving Predict API.
import numpy
from sagemaker.predictor import json_serializer, json_deserializer
# define predictor
predictor = estimator.deploy(1, instance_type)
# format request
data = {'instances': numpy.asarray(np_array).astype(float).tolist()}
# set predictor request/response formats
predictor.accept = 'application/json'
predictor.content_type = 'application/json'
predictor.serializer = json_serializer
predictor.deserializer = json_deserializer
# run inference using SageMaker predict class
# https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/predictor.py
predictor.predict(data)
You can refer the example notebook here to train and host custom TensorFlow container.

How to create a model for csv data set and to calculate predicted result using TensorFlow JS

I am new to TensorFlow JS.
I followed the TensorFlow JS documentation to create model and train it to calculate predicted result from the created model.
But I don't know how to train the created model for CSV file and calculate predicted result for two or more column in the CSV file.
Could anybody guide me with a sample for creating, training model with CSV file and to calculate the predicted result?
const csvUrl = 'https://storage.googleapis.com/tfjs-examples/multivariate-linear-regression/data/boston-housing-train.csv';
function save(model) {
return model.save('downloads://boston_model');
}
function load() {
return tf.loadModel('indexeddb://boston_model');
}
async function run() {
// We want to predict the column "medv", which represents a median value of a
// home (in $1000s), so we mark it as a label.
const csvDataset = tf.data.csv(
csvUrl, {
columnConfigs: {
medv: {
isLabel: true
}
}
});
// Number of features is the number of column names minus one for the label
// column.
const numOfFeatures = (await csvDataset.columnNames()).length - 1;
// Prepare the Dataset for training.
const flattenedDataset =
csvDataset
.map(([rawFeatures, rawLabel]) =>
// Convert rows from object form (keyed by column name) to array form.
[Object.values(rawFeatures), Object.values(rawLabel)])
.batch(10);
// Define the model.
const model = tf.sequential();
model.add(tf.layers.dense({
inputShape: [numOfFeatures],
units: 1
}));
model.compile({
optimizer: tf.train.sgd(0.000001),
loss: 'meanSquaredError'
});
// Fit the model using the prepared Dataset
model.fitDataset(flattenedDataset, {
epochs: 10,
callbacks: {
onEpochEnd: async (epoch, logs) => {
console.log(epoch, logs.loss);
}
}
});
const savedModel=save(model);
}
run().then(() => console.log('Done'));
Using tf.data.csv you can train a model with a csv file.
However the browser cannot directly read a file. Thus, you have to serve the csv file on a local server
Update
Your model uses only a single perceptron. Using multiple perceptrons can help improving the accuracy of the model, ie adding multiple layers. You can have a look here as how it is done.

Tensorflow serving: Unable to base64 decode

I use the slim package resnet_v2_152 to train a classification model.
Then it is exported to .pb file to provide a service.
Because the input is image, so it would be encoded with web-safe base64 encoding. It looks like:
serialized_tf_example = tf.placeholder(dtype=tf.string, name='tf_example')
decoded = tf.decode_base64(serialized_tf_example)
I then encode an image with base64 such that:
img_path = '/Users/wuyanxue/Desktop/not_emoji1.jpeg'
img_b64 = base64.b64encode(open(img_path, 'rb').read())
s = str(img_b64, encoding='utf-8')
s = s.replace('+', '-').replace(r'/', '_')
My post data is as structured as follow:
post_data = {
'signature_name': 'predict',
'instances':[ {
'inputs':
{ 'b64': s }
}]
}
Finally, I post a HTTP request to this server:
res = requests.post('server_address', json=post_data)
It gives me:
'{ "error": "Failed to process element: 0 key: inputs of \\\'instances\\\' list. Error: Invalid argument: Unable to base64 decode" }'
I want to know how it could be encountered? And are there some solutions for that?
I had the same issue when using python3. I solved it by adding a 'b' - a byte-like object instead of the default str to the encode function:
b'{"instances" : [{"b64": "%s"}]}' % base64.b64encode(
dl_request.content)
Hope that helps, please see this answer for extra info.
This question is already solved.
post_data = {
'signature_name': 'predict',
'instances':[ {
'inputs':
{ 'b64': s }
}]
}
We see that inputs is with 'b64' flag, which illustrates that tensorflow serving will decode s with base64 code.
It belongs to the tensorflow serving internal method.
So, the placeholder:
serialized_tf_example = tf.placeholder(dtype=tf.string, name='tf_example')
will directly receive the binary format of the input data BUT NOT base64 format.
So, finally,
decoded = tf.decode_base64(serialized_tf_example)
is NOT necessary.

How to pass list/array of request objects to tensorflow serving in one server call?

After loading the wide and deep model, i was able to make prediction for one request object using the map of features and then serializing it to string for predictions as shown below-
is there a way we can create a batch of requests objects and send them for prediction to tensorflow server?
Code for single prediction looks like this-
for (each feature in feature list) {
Feature feature = null;
feature = Feature.newBuilder().setBytesList(BytesList.newBuilder().addValue(ByteString.copyFromUtf8("dummy string"))).build();
if (feature != null) {
inputFeatureMap.put(name, feature);
}
}
//Converting features(in inputFeatureMap) corresponding to one request into 'Features' Proto object
Features features = Features.newBuilder().putAllFeature(inputFeatureMap).build();
inputStr = Example.newBuilder().setFeatures(features).build().toByteString();
}
TensorProto proto = TensorProto.newBuilder()
.addStringVal(inputStr)
.setTensorShape(TensorShapeProto.newBuilder().addDim(TensorShapeProto.Dim.newBuilder().setSize(1).build()).build())
.setDtype(DataType.DT_STRING)
.build();
PredictRequest req = PredictRequest.newBuilder()
.setModelSpec(ModelSpec.newBuilder()
.setName("your serving model name")
.setSignatureName("serving_default")
.setVersion(Int64Value.newBuilder().setValue(modelVer)))
.putAllInputs(ImmutableMap.of("inputs", proto))
.build();
PredictResponse response = stub.predict(req);
System.out.println(response.getOutputsMap());
Is there a way we can send the list of Features Object for predictions, something similar to this-
List<Features> = {someway to create array/list of inputFeatureMap's which can be converted to serialized string.}
For anyone stumbling here, I found a simple workaround with Example proto to do batch request. I will borrow code from this question and modify it for the batch.
Features features =
Features.newBuilder()
.putFeature("Attribute1", feature("A12"))
.putFeature("Attribute2", feature(12))
.putFeature("Attribute3", feature("A32"))
.putFeature("Attribute4", feature("A40"))
.putFeature("Attribute5", feature(7472))
.putFeature("Attribute6", feature("A65"))
.putFeature("Attribute7", feature("A71"))
.putFeature("Attribute8", feature(1))
.putFeature("Attribute9", feature("A92"))
.putFeature("Attribute10", feature("A101"))
.putFeature("Attribute11", feature(2))
.putFeature("Attribute12", feature("A121"))
.putFeature("Attribute13", feature(24))
.putFeature("Attribute14", feature("A143"))
.putFeature("Attribute15", feature("A151"))
.putFeature("Attribute16", feature(1))
.putFeature("Attribute17", feature("A171"))
.putFeature("Attribute18", feature(1))
.putFeature("Attribute19", feature("A191"))
.putFeature("Attribute20", feature("A201"))
.build();
Example example = Example.newBuilder().setFeatures(features).build();
String pfad = System.getProperty("user.dir") + "\\1511523781";
try (SavedModelBundle model = SavedModelBundle.load(pfad, "serve")) {
Session session = model.session();
final String xName = "input_example_tensor";
final String scoresName = "dnn/head/predictions/probabilities:0";
try (Tensor<String> inputBatch = Tensors.create(new byte[][] {example.toByteArray(), example.toByteArray(), example.toByteArray(), example.toByteArray()});
Tensor<Float> output =
session
.runner()
.feed(xName, inputBatch)
.fetch(scoresName)
.run()
.get(0)
.expect(Float.class)) {
System.out.println(Arrays.deepToString(output.copyTo(new float[4][2])));
}
}
Essentially you can pass each example as an object in byte[4][] and you will receive the result in the same shape float[4][2]