How to send a numpy array to a SageMaker endpoint using a Lambda function

How do I invoke a SageMaker endpoint when the input data is a numpy.ndarray?
I have deployed a SageMaker model and I am trying to hit it from a Lambda function, but I cannot figure out how to do it; I keep getting a server error.
One row of the input data is shown below; the full data set has shape=(91, 5, 12).
array([[[0.30440741, 0.30209799, 0.33520652, 0.41558442, 0.69096432,
0.69611016, 0.25153326, 0.98333333, 0.82352941, 0.77187154,
0.7664042 , 0.74468085],
[0.30894981, 0.33151662, 0.22907725, 0.46753247, 0.69437367,
0.70410559, 0.29259044, 0.9 , 0.80882353, 0.79401993,
0.89501312, 0.86997636],
[0.33511896, 0.34338939, 0.24065546, 0.48051948, 0.70384005,
0.71058715, 0.31031288, 0.86666667, 0.89705882, 0.82724252,
0.92650919, 0.89125296],
[0.34617355, 0.36150251, 0.23726854, 0.54545455, 0.71368726,
0.71703244, 0.30228356, 0.85 , 0.86764706, 0.86157254,
0.97112861, 0.94089835],
[0.36269508, 0.35923332, 0.40285461, 0.62337662, 0.73325475,
0.7274392 , 0.26241391, 0.85 , 0.82352941, 0.89922481,
0.9343832 , 0.90780142]]])
I am using the following code but unable to invoke the endpoint
import boto3

def lambda_handler(event, context):
    # The SageMaker runtime is what allows us to invoke the endpoint that we've created.
    runtime = boto3.Session().client('sagemaker-runtime')
    endpoint = 'sagemaker-tensorflow-2019-04-22-07-16-51-717'

    print('givendata ', event['body'])
    # data = numpy.array([numpy.array(xi) for xi in event['body']])
    data = event['body']
    print('numpy array ', data)

    # Now we use the SageMaker runtime to invoke our endpoint, sending the review we were given
    response = runtime.invoke_endpoint(EndpointName=endpoint,         # The name of the endpoint we created
                                       ContentType='application/json',  # The data format that is expected
                                       Body=data)                      # The actual review

    # The response is an HTTP response whose body contains the result of our inference
    result = response['Body'].read().decode('utf-8')
    print('response', result)

    # Round the result so that our web app only gets '1' or '0' as a response.
    result = round(float(result))

    return {
        'statusCode': 200,
        'headers': {'Content-Type': 'text/plain', 'Access-Control-Allow-Origin': '*'},
        'body': str(result)
    }
I cannot figure out what should go in ContentType, because I don't know what the MIME type is for a numpy.ndarray.

Illustration of what I had and how I solved it
from sagemaker.tensorflow import TensorFlowPredictor
predictor = TensorFlowPredictor('sagemaker-tensorflow-serving-date')
data = np.array(raw_data)
response = predictor.predict(data=data)
predictions = response['predictions']
print(predictions)
What I did to find the answer:
Looked up the predictions.py and content_types.py implementations in the sagemaker Python library to see which content types it uses and what arguments it takes.
First I thought that the application/x-npy content type was used, so I tried the serialisation code from predictor.py and passed application/x-npy as the content_type to invoke_endpoint.
After receiving 415 (unsupported media type), the issue was clearly still the content type. The following print statements revealed which content type the predictor actually uses (application/json), so I took the appropriate serialisation code from predictor.py:
from sagemaker.tensorflow import TensorFlowPredictor
predictor = TensorFlowPredictor('sagemaker-tensorflow-serving-date')
data = np.array(raw_data)
response = predictor.predict(data=data)
print(predictor.content_type)
print(predictor.accept)
predictions = response['predictions']
print(predictions)
TL;DR
Solution for lambda:
import json
import boto3
import botocore.config
import numpy as np  # not available in Lambda by default; see the note below

ENDPOINT_NAME = 'sagemaker-tensorflow-serving-date'
config = botocore.config.Config(read_timeout=80)
runtime = boto3.client('runtime.sagemaker', config=config)

data = np.array(raw_data)
payload = json.dumps(data.tolist())

response = runtime.invoke_endpoint(EndpointName=ENDPOINT_NAME,
                                   ContentType='application/json',
                                   Body=payload)
result = json.loads(response['Body'].read().decode())
res = result['predictions']
Note: numpy is not included in Lambda, so you would either have to bundle numpy yourself, or, instead of data.tolist(), work with a plain Python list and json.dumps that list (of lists). From your code it looks like you already have a Python list rather than a numpy array, so a simple JSON dump should work.
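For reference, here is a minimal sketch of the handler along those lines, assuming event['body'] already arrives as a nested Python list like the (1, 5, 12) sample above (the endpoint name is a placeholder):

import json
import boto3

runtime = boto3.client('sagemaker-runtime')

def lambda_handler(event, context):
    # event['body'] is assumed to be a plain (nested) Python list, so no numpy is needed;
    # if API Gateway delivers it as a JSON string instead, pass it through as-is rather than re-dumping it
    payload = json.dumps(event['body'])
    response = runtime.invoke_endpoint(EndpointName='sagemaker-tensorflow-serving-date',  # placeholder name
                                       ContentType='application/json',
                                       Body=payload)
    result = json.loads(response['Body'].read().decode())
    return {
        'statusCode': 200,
        'headers': {'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*'},
        'body': json.dumps(result['predictions'])
    }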

If you are training and hosting a custom algorithm on SageMaker using TensorFlow, you can serialize/deserialize the request and response as JSON, as in the TensorFlow Serving Predict API.
import numpy
from sagemaker.predictor import json_serializer, json_deserializer
# define predictor
predictor = estimator.deploy(1, instance_type)
# format request
data = {'instances': numpy.asarray(np_array).astype(float).tolist()}
# set predictor request/response formats
predictor.accept = 'application/json'
predictor.content_type = 'application/json'
predictor.serializer = json_serializer
predictor.deserializer = json_deserializer
# run inference using SageMaker predict class
# https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/predictor.py
predictor.predict(data)
You can refer to the example notebook here to train and host a custom TensorFlow container.

Related

Vertex AI Model Batch prediction, issue with referencing existing model and input file on Cloud Storage

I'm struggling to correctly set up a Vertex AI pipeline which does the following:
1. read data from an API, store it to GCS, and use it as input for batch prediction
2. get an existing model (video classification on Vertex AI)
3. create a batch prediction job with the input from point 1
As you will see, I don't have much experience with Vertex Pipelines/Kubeflow, so I'm asking for help/advice; hopefully it's just some beginner mistake.
This is the gist of the code I'm using as the pipeline:
from google_cloud_pipeline_components import aiplatform as gcc_aip
from kfp.v2 import dsl
from kfp.v2.dsl import component
from kfp.v2.dsl import (
    Output,
    Artifact,
    Model,
)

PROJECT_ID = 'my-gcp-project'
BUCKET_NAME = "mybucket"
PIPELINE_ROOT = "{}/pipeline_root".format(BUCKET_NAME)


@component
def get_input_data() -> str:
    # getting data from API, save to Cloud Storage
    # return GS URI
    gcs_batch_input_path = 'gs://somebucket/file'
    return gcs_batch_input_path


@component(
    base_image="python:3.9",
    packages_to_install=['google-cloud-aiplatform==1.8.0']
)
def load_ml_model(project_id: str, model: Output[Artifact]):
    """Load existing Vertex model"""
    import google.cloud.aiplatform as aip

    model_id = '1234'
    model = aip.Model(model_name=model_id, project=project_id, location='us-central1')


@dsl.pipeline(
    name="batch-pipeline", pipeline_root=PIPELINE_ROOT,
)
def pipeline(gcp_project: str):
    input_data = get_input_data()
    ml_model = load_ml_model(gcp_project)

    gcc_aip.ModelBatchPredictOp(
        project=PROJECT_ID,
        job_display_name=f'test-prediction',
        model=ml_model.output,
        gcs_source_uris=[input_data.output],  # this doesn't work
        # gcs_source_uris=['gs://mybucket/output/'],  # hardcoded gs uri works
        gcs_destination_output_uri_prefix=f'gs://{PIPELINE_ROOT}/prediction_output/'
    )


if __name__ == '__main__':
    from kfp.v2 import compiler
    import google.cloud.aiplatform as aip

    pipeline_export_filepath = 'test-pipeline.json'
    compiler.Compiler().compile(pipeline_func=pipeline,
                                package_path=pipeline_export_filepath)
    # pipeline_params = {
    #     'gcp_project': PROJECT_ID,
    # }
    # job = aip.PipelineJob(
    #     display_name='test-pipeline',
    #     template_path=pipeline_export_filepath,
    #     pipeline_root=f'gs://{PIPELINE_ROOT}',
    #     project=PROJECT_ID,
    #     parameter_values=pipeline_params,
    # )
    # job.run()
When running the pipeline, it throws this exception during batch prediction:
details = "List of found errors: 1.Field: batch_prediction_job.model; Message: Invalid Model resource name.
so I'm not sure what could be wrong. I tried to load the model in a notebook (outside of a component) and it returns correctly.
The second issue I'm having is referencing a GCS URI output from a component as the batch job input:
input_data = get_input_data2()

gcc_aip.ModelBatchPredictOp(
    project=PROJECT_ID,
    job_display_name=f'test-prediction',
    model=ml_model.output,
    gcs_source_uris=[input_data.output],  # this doesn't work
    # gcs_source_uris=['gs://mybucket/output/'],  # hardcoded gs uri works
    gcs_destination_output_uri_prefix=f'gs://{PIPELINE_ROOT}/prediction_output/'
)
During compilation I get the following exception: TypeError: Object of type PipelineParam is not JSON serializable, though I think this could be an issue with the ModelBatchPredictOp component.
Again, any help/advice is appreciated. I've been dealing with this since yesterday, so maybe I missed something obvious.
libraries I'm using:
google-cloud-aiplatform==1.8.0
google-cloud-pipeline-components==0.2.0
kfp==1.8.10
kfp-pipeline-spec==0.1.13
kfp-server-api==1.7.1
UPDATE
After comments, some research and tuning, this works for referencing the model:

@component
def load_ml_model(project_id: str, model: Output[Artifact]):
    region = 'us-central1'
    model_id = '1234'
    model_uid = f'projects/{project_id}/locations/{region}/models/{model_id}'
    model.uri = model_uid
    model.metadata['resourceName'] = model_uid
and then I can use it as intended:
batch_predict_op = gcc_aip.ModelBatchPredictOp(
    project=gcp_project,
    job_display_name=f'batch-prediction-test',
    model=ml_model.outputs['model'],
    gcs_source_uris=[input_batch_gcs_path],
    gcs_destination_output_uri_prefix=f'gs://{BUCKET_NAME}/prediction_output/test'
)
UPDATE 2
Regarding the GCS path, a workaround is to define the path outside of the component and pass it in as an input parameter, for example (abbreviated):
@dsl.pipeline(
    name="my-pipeline",
    pipeline_root=PIPELINE_ROOT,
)
def pipeline(
        gcp_project: str,
        region: str,
        bucket: str
):
    ts = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")

    gcs_prediction_input_path = f'gs://{BUCKET_NAME}/prediction_input/video_batch_prediction_input_{ts}.jsonl'

    batch_input_data_op = get_input_data(gcs_prediction_input_path)  # this loads input data to the GCS path

    batch_predict_op = gcc_aip.ModelBatchPredictOp(
        project=gcp_project,
        model=training_job_run_op.outputs["model"],
        job_display_name='batch-prediction',
        # gcs_source_uris=[batch_input_data_op.output],
        gcs_source_uris=[gcs_prediction_input_path],
        gcs_destination_output_uri_prefix=f'gs://{BUCKET_NAME}/prediction_output/',
    ).after(batch_input_data_op)  # 'after' is needed so this runs once the input data is prepared, since get_input_data doesn't return anything
I'm still not sure why it doesn't work/compile when I return the GCS path from the get_input_data component.
I'm glad you solved most of your main issues and found a workaround for the model declaration.
Regarding your input.output observation on gcs_source_uris: the reason is the way the function/class returns its value. If you dig into the classes/methods of google_cloud_pipeline_components, you will find that they implement a structure which allows you to use .outputs on the value returned by the called function.
If you go to the implementation of one of the pipeline components, you will see that it returns an output array from the convert_method_to_component function. So, in order to have this in your custom class/function, your function should return a value that can be accessed as an attribute. Below is a basic implementation of it.
class CustomClass():
    def __init__(self):
        self.return_val = {'path': 'custompath', 'desc': 'a desc'}

    @property
    def output(self):
        return self.return_val


hello = CustomClass()
print(hello.output['path'])
If you want to dig further into it, you can go to the following pages:
convert_method_to_component, which is the implementation of convert_method_to_component
Properties, basics of property in python.

I want to get an Excel file from the data frame that is created, which updates automatically as written in the code

I have tried two methods, and both show a different location than the one I gave, as shown in this image.
apikey = 'abcd'

import pandas as pd
from alpha_vantage.timeseries import TimeSeries
import time

ts = TimeSeries(key=apikey, output_format='pandas')
data, metadata = ts.get_intraday(symbol='name', interval='1min', outputsize='full')
data

while True:
    data, metadata = ts.get_intraday(symbol='TCS', interval='1min', outputsize='full')
    data.to_excel('livedat.xlsx')
    time.sleep(60)
The code runs properly, but I don't know how to get the data into the Excel file.
Important: the method should produce a file that is updated on time, i.e. every 1 minute, automatically.
Also, I am using IBM Watson Studio to write the code.
I am not familiar with the alpha_vantage wrapper that you are using, however this is how I would approach your question. The code works and I have included comments.
To read the file back in a Python script, I would do pd.read_excel(filepath).
import requests
import pandas as pd
import time
import datetime

# Your API KEY and the URL we will request from
API_KEY = "YOUR API KEY"
url = "https://www.alphavantage.co/query?"


def Generate_file(symbol="IBM", interval="1min"):
    # URL parameters
    parameters = {"function": "TIME_SERIES_INTRADAY",
                  "symbol": symbol,
                  "interval": interval,
                  "apikey": API_KEY,
                  "outputsize": "compact"}

    # get the json response from AlphaVantage
    response = requests.get(url, params=parameters)
    data = response.json()

    # filter the response to only get the time series data we want
    time_series_interval = f"Time Series ({interval})"
    prices = data[time_series_interval]

    # convert the filtered response to a Pandas DataFrame
    df = pd.DataFrame.from_dict(prices, orient="index").reset_index()
    df = df.rename(columns={"index": time_series_interval})

    # create a timestamp for our excel file, so that the file does not get overwritten with new data each time
    current_time = datetime.datetime.now()
    file_timestamp = current_time.strftime("%Y%m%d_%H.%M")
    filename = f"livedat_{file_timestamp}.xlsx"
    df.to_excel(filename)


# set a limit on the number of calls we make to prevent an infinite loop
call_limit = 3
number_of_calls = 0

while number_of_calls < call_limit:
    Generate_file()  # our function
    number_of_calls += 1
    time.sleep(60)
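To read one of the generated files back into a DataFrame later, something along these lines should work (a sketch relying only on the filename pattern produced above; the timestamp format sorts lexicographically, so max() picks the newest file):

import glob
import pandas as pd

latest_file = max(glob.glob("livedat_*.xlsx"))  # newest file, thanks to the sortable timestamp
df = pd.read_excel(latest_file, index_col=0)
print(df.head())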

Tensorflow serving: Unable to base64 decode

I use the slim package resnet_v2_152 to train a classification model.
The model is then exported to a .pb file to provide a service.
Because the input is an image, it is encoded with web-safe base64 encoding. It looks like this:
serialized_tf_example = tf.placeholder(dtype=tf.string, name='tf_example')
decoded = tf.decode_base64(serialized_tf_example)
I then encode an image with base64 such that:
img_path = '/Users/wuyanxue/Desktop/not_emoji1.jpeg'
img_b64 = base64.b64encode(open(img_path, 'rb').read())
s = str(img_b64, encoding='utf-8')
s = s.replace('+', '-').replace(r'/', '_')
My post data is structured as follows:
post_data = {
    'signature_name': 'predict',
    'instances': [{
        'inputs': {'b64': s}
    }]
}
Finally, I post a HTTP request to this server:
res = requests.post('server_address', json=post_data)
It gives me:
'{ "error": "Failed to process element: 0 key: inputs of \\\'instances\\\' list. Error: Invalid argument: Unable to base64 decode" }'
I want to know why this happens, and whether there are solutions for it.
I had the same issue when using Python 3. I solved it by adding a 'b' prefix, so that a bytes-like object instead of the default str is passed to the encode function:
b'{"instances" : [{"b64": "%s"}]}' % base64.b64encode(
dl_request.content)
Hope that helps, please see this answer for extra info.
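For completeness, a minimal sketch of the whole request built as bytes (the URL, model name, and file path are placeholders, not from the original post):

import base64
import requests

with open('image.jpeg', 'rb') as f:  # placeholder path
    payload = b'{"instances": [{"b64": "%s"}]}' % base64.b64encode(f.read())

res = requests.post('http://localhost:8501/v1/models/my_model:predict',  # placeholder URL
                    data=payload)
print(res.json())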
This question is already solved.
post_data = {
    'signature_name': 'predict',
    'instances': [{
        'inputs': {'b64': s}
    }]
}
We see that inputs carries the 'b64' flag, which tells TensorFlow Serving to base64-decode s itself.
This decoding is handled internally by TensorFlow Serving.
So the placeholder:
serialized_tf_example = tf.placeholder(dtype=tf.string, name='tf_example')
will directly receive the binary format of the input data, NOT the base64 format.
So, finally,
decoded = tf.decode_base64(serialized_tf_example)
is NOT necessary.
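In other words, the exported graph can consume the placeholder bytes directly. Below is a minimal sketch of that idea (TF 1.x style; the JPEG-decoding step is an assumption about the rest of the graph, not taken from the original post):

import tensorflow as tf

# TensorFlow Serving already decodes the {"b64": ...} field, so the placeholder
# receives raw image bytes and tf.decode_base64 must not be applied again.
serialized_tf_example = tf.placeholder(dtype=tf.string, shape=[None], name='tf_example')
images = tf.map_fn(lambda img_bytes: tf.image.decode_jpeg(img_bytes, channels=3),
                   serialized_tf_example, dtype=tf.uint8)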

apollo-upload-client and graphene-django

I have a question about using apollo-upload-client and graphene-django. Here I've discovered that apollo-upload-client adds the operations to formData. But here graphene-django only tries to get the query parameter. The question is: where and how should this be fixed?
If you're referring to the data that has a header like this (when viewing the HTTP request in Chrome dev tools):
Content-Disposition: form-data; name="operations"
and data like
{"operationName":"MyMutation","variables":{"myData"....}, "query":"mutation MyMutation"...},
the graphene-python library interprets this and assembles it into a query for you, inserting the variables and removing the file data from the query. If you are using Django, you can find all of the uploaded files in info.context.FILES when writing a mutation.
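As a minimal sketch of that last point (the mutation class and field names here are illustrative, not from the original code):

import graphene

class UploadFile(graphene.Mutation):
    success = graphene.Boolean()

    def mutate(root, info, **kwargs):
        # apollo-upload-client attaches the files to the underlying Django request,
        # so they show up in info.context.FILES rather than in the GraphQL variables
        for field_name, uploaded_file in info.context.FILES.items():
            print(field_name, uploaded_file.name, uploaded_file.size)
        return UploadFile(success=True)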
Here's my solution to support the latest apollo-upload-client (8.1). I recently had to revisit my Django code when I upgraded from apollo-upload-client 5.x to 8.x. Hope this helps.
Sorry I'm using an older graphene-django but hopefully you can update the mutation syntax to the latest.
Upload scalar type (passthrough, basically):
class Upload(Scalar):
    '''A file upload'''

    @staticmethod
    def serialize(value):
        raise Exception('File upload cannot be serialized')

    @staticmethod
    def parse_literal(node):
        raise Exception('No such thing as a file upload literal')

    @staticmethod
    def parse_value(value):
        return value
My upload mutation:
class UploadImage(relay.ClientIDMutation):
    class Input:
        image = graphene.Field(Upload, required=True)

    success = graphene.Field(graphene.Boolean)

    @classmethod
    def mutate_and_get_payload(cls, input, context, info):
        with NamedTemporaryFile(delete=False) as tmp:
            for chunk in input['image'].chunks():
                tmp.write(chunk)
            image_file = tmp.name
        # do something with image_file
        return UploadImage(success=True)
The heavy lifting happens in a custom GraphQL view. Basically it injects the file object into the appropriate places in the variables map.
def maybe_int(s):
    try:
        return int(s)
    except ValueError:
        return s


class CustomGraphqlView(GraphQLView):
    def parse_request_json(self, json_string):
        try:
            request_json = json.loads(json_string)
            if self.batch:
                assert isinstance(request_json, list), (
                    'Batch requests should receive a list, but received {}.').format(repr(request_json))
                assert len(request_json) > 0, ('Received an empty list in the batch request.')
            else:
                assert isinstance(request_json, dict), ('The received data is not a valid JSON query.')
            return request_json
        except AssertionError as e:
            raise HttpError(HttpResponseBadRequest(str(e)))
        except BaseException:
            logger.exception('Invalid JSON')
            raise HttpError(HttpResponseBadRequest('POST body sent invalid JSON.'))

    def parse_body(self, request):
        content_type = self.get_content_type(request)
        if content_type == 'application/graphql':
            return {'query': request.body.decode()}
        elif content_type == 'application/json':
            return self.parse_request_json(request.body.decode('utf-8'))
        elif content_type in ['application/x-www-form-urlencoded', 'multipart/form-data']:
            operations_json = request.POST.get('operations')
            map_json = request.POST.get('map')
            if operations_json and map_json:
                operations = self.parse_request_json(operations_json)
                map = self.parse_request_json(map_json)
                for file_id, f in request.FILES.items():
                    for name in map[file_id]:
                        segments = [maybe_int(s) for s in name.split('.')]
                        cur = operations
                        while len(segments) > 1:
                            cur = cur[segments.pop(0)]
                        cur[segments.pop(0)] = f
                logger.info('parse_body %s', operations)
                return operations
            else:
                return request.POST
        return {}
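The view can then be wired in like any other GraphQLView subclass; here is a sketch of the urls.py entry (the URL pattern and csrf_exempt usage are assumptions about the rest of the project):

from django.urls import path
from django.views.decorators.csrf import csrf_exempt

urlpatterns = [
    path('graphql/', csrf_exempt(CustomGraphqlView.as_view(graphiql=True))),
]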

How to pass list/array of request objects to tensorflow serving in one server call?

After loading the wide-and-deep model, I was able to make a prediction for one request object by building a map of features and then serializing it to a string, as shown below.
Is there a way we can create a batch of request objects and send them to the TensorFlow server for prediction in one call?
The code for a single prediction looks like this:
for (String name : featureNames) {  // pseudocode in the original: "for each feature in feature list"
    Feature feature = Feature.newBuilder()
            .setBytesList(BytesList.newBuilder().addValue(ByteString.copyFromUtf8("dummy string")))
            .build();
    if (feature != null) {
        inputFeatureMap.put(name, feature);
    }
}

// Converting features (in inputFeatureMap) corresponding to one request into a 'Features' proto object
Features features = Features.newBuilder().putAllFeature(inputFeatureMap).build();
inputStr = Example.newBuilder().setFeatures(features).build().toByteString();
TensorProto proto = TensorProto.newBuilder()
        .addStringVal(inputStr)
        .setTensorShape(TensorShapeProto.newBuilder()
                .addDim(TensorShapeProto.Dim.newBuilder().setSize(1).build())
                .build())
        .setDtype(DataType.DT_STRING)
        .build();

PredictRequest req = PredictRequest.newBuilder()
        .setModelSpec(ModelSpec.newBuilder()
                .setName("your serving model name")
                .setSignatureName("serving_default")
                .setVersion(Int64Value.newBuilder().setValue(modelVer)))
        .putAllInputs(ImmutableMap.of("inputs", proto))
        .build();

PredictResponse response = stub.predict(req);
System.out.println(response.getOutputsMap());
Is there a way we can send a list of Features objects for prediction, something similar to this?
List<Features> = {some way to create an array/list of inputFeatureMaps which can be converted to a serialized string}
For anyone stumbling here, I found a simple workaround using the Example proto to do a batch request. I will borrow the code from this question and modify it for the batch.
Features features =
        Features.newBuilder()
                .putFeature("Attribute1", feature("A12"))
                .putFeature("Attribute2", feature(12))
                .putFeature("Attribute3", feature("A32"))
                .putFeature("Attribute4", feature("A40"))
                .putFeature("Attribute5", feature(7472))
                .putFeature("Attribute6", feature("A65"))
                .putFeature("Attribute7", feature("A71"))
                .putFeature("Attribute8", feature(1))
                .putFeature("Attribute9", feature("A92"))
                .putFeature("Attribute10", feature("A101"))
                .putFeature("Attribute11", feature(2))
                .putFeature("Attribute12", feature("A121"))
                .putFeature("Attribute13", feature(24))
                .putFeature("Attribute14", feature("A143"))
                .putFeature("Attribute15", feature("A151"))
                .putFeature("Attribute16", feature(1))
                .putFeature("Attribute17", feature("A171"))
                .putFeature("Attribute18", feature(1))
                .putFeature("Attribute19", feature("A191"))
                .putFeature("Attribute20", feature("A201"))
                .build();
Example example = Example.newBuilder().setFeatures(features).build();

String pfad = System.getProperty("user.dir") + "\\1511523781";
try (SavedModelBundle model = SavedModelBundle.load(pfad, "serve")) {
    Session session = model.session();
    final String xName = "input_example_tensor";
    final String scoresName = "dnn/head/predictions/probabilities:0";

    try (Tensor<String> inputBatch = Tensors.create(new byte[][] {example.toByteArray(), example.toByteArray(), example.toByteArray(), example.toByteArray()});
         Tensor<Float> output =
                 session
                         .runner()
                         .feed(xName, inputBatch)
                         .fetch(scoresName)
                         .run()
                         .get(0)
                         .expect(Float.class)) {
        System.out.println(Arrays.deepToString(output.copyTo(new float[4][2])));
    }
}
Essentially, you can pass each example as an object in a byte[4][], and you will receive the result in the same batch shape, float[4][2].