Can I customize TensorFlow Serving? - tensorflow-serving

I am studying TensorFlow Serving. I am not familiar with TensorFlow and have run into many difficulties, but I am working through the Google documentation and other material.
For example, after downloading the TensorFlow Serving source and compiling it,
tensorflow_model_server --port=9000 --model_name=mnist --model_base_path=/tmp/mnist_model
works normally and communicates with clients using gRPC.
However, do I have to use TensorFlow Serving only through prebuilt binaries already provided by Google, like tensorflow_model_server? Or can I include the headers and link against the library in C++ so that I can write my own program?

For serving, you can use the TensorFlow Serving C++ API directly; there are code examples for this.
Additionally, Google also provides a Docker image that serves the models and exposes a client API in RESTful and gRPC style, so that you can write a client in any language.
tensorflow_model_server is part of that Dockerized server, and you will need to write your own client to interact with it; there are code examples for making RESTful or gRPC calls to the server.
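As a rough illustration (not the original linked examples), here is a minimal Python sketch of a RESTful predict call, assuming the server exposes its REST API on the default port 8501 and serves a model named mnist; the input format is a placeholder and must match your model's serving signature:
import json
import requests  # third-party HTTP client, assumed installed

# TensorFlow Serving REST API: POST /v1/models/<model_name>:predict
instances = [[0.0] * 784]  # one dummy flattened 28x28 image; adjust to your signature
resp = requests.post(
    "http://localhost:8501/v1/models/mnist:predict",
    data=json.dumps({"instances": instances}),
)
print(resp.json())
A gRPC client works the same way conceptually: build a PredictRequest from the tensorflow-serving-api package and call the PredictionService stub on the gRPC port instead.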

Related

While integrating a SageMaker endpoint with Datadog, is there any way we don't use CloudWatch logs?

I am trying to integrate a SageMaker endpoint that uses a built-in SageMaker container with Datadog. Is there a way to get the logs or metrics without using CloudWatch? I found that ddtrace can help with it, but the issue is that using ddtrace requires running gunicorn under ddtrace, and for that I would need to build my own container, which is not possible since I am using the built-in container. Is it possible to use ddtrace somehow? Mainly, if I want to use the TensorFlow Serving (AWS) container with Datadog, how can I use it without using CloudWatch logs?
I have followed this tutorial, but I want to know how I can use it with the built-in container: https://medium.com/tech-shift-com/monitoring-amazon-sagemaker-endpointwith-datadog-ae40dd2fab05
You can customize the prebuilt images. Follow the SageMaker docs for extending a prebuilt container; in step 2, they demonstrate how to define the base image you are extending. Once you have that, I think it should be as easy as adding RUN pip install ddtrace to your Dockerfile.
You might also try adding a SageMaker requirements ENV var when defining the model, if that works for you.
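As a loose sketch of that second suggestion using the SageMaker Python SDK (v2): the environment variable name SAGEMAKER_REQUIREMENTS, the S3 path, role ARN, and framework version below are all placeholders/assumptions to verify against the container's documentation, not values from the original post.
from sagemaker.tensorflow import TensorFlowModel

# Hypothetical: ask the prebuilt TensorFlow Serving container to install extra
# packages (e.g. ddtrace) from a requirements file shipped with the model artifact.
model = TensorFlowModel(
    model_data="s3://my-bucket/model.tar.gz",               # placeholder artifact
    role="arn:aws:iam::123456789012:role/MySageMakerRole",  # placeholder role
    framework_version="2.8",                                # placeholder version
    env={"SAGEMAKER_REQUIREMENTS": "requirements.txt"},     # variable name is an assumption
)
predictor = model.deploy(initial_instance_count=1, instance_type="ml.m5.large")
If the environment-variable route does not work for your container version, extending the prebuilt image as described above is the more reliable path.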

export_inference_graph.py vs export_tflite_ssd_graph.py

The output of export_inference_graph.py is
- model.ckpt.data-00000-of-00001
- model.ckpt.index
- model.ckpt.meta
- frozen_inference_graph.pb
- saved_model/ (a directory)
while the output of export_tflite_ssd_graph.py is
- tflite_graph.pbtxt
- tflite_graph.pb
What is the difference between the two frozen files?
I assume you are trying to use your object detection model on mobile devices, for which you need to convert your model to a TFLite version.
However, you cannot convert models like Faster R-CNN to TFLite; you need to go for SSD models to be used on mobile devices.
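For the SSD route, here is a rough Python sketch of the conversion using the TF 1.x converter; the tensor names and the 300x300 input shape are assumptions that depend on how the graph was exported, so adjust them to your pipeline config:
import tensorflow as tf  # TF 1.x

# Convert the frozen graph produced by export_tflite_ssd_graph.py to a .tflite file.
converter = tf.lite.TFLiteConverter.from_frozen_graph(
    "tflite_graph.pb",
    input_arrays=["normalized_input_image_tensor"],
    output_arrays=[
        "TFLite_Detection_PostProcess",
        "TFLite_Detection_PostProcess:1",
        "TFLite_Detection_PostProcess:2",
        "TFLite_Detection_PostProcess:3",
    ],
    input_shapes={"normalized_input_image_tensor": [1, 300, 300, 3]},
)
converter.allow_custom_ops = True  # the detection post-processing op is a custom TFLite op
tflite_model = converter.convert()
with open("detect.tflite", "wb") as f:
    f.write(tflite_model)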
Another way to use a model like Faster R-CNN in your deployment is:
Use an AWS EC2 TensorFlow AMI, deploy your model in the cloud, and have it routed to your website domain or mobile device. When the server receives an image through an HTTP form that the user fills in, the model processes it on your cloud server and sends the result back to the requesting terminal.

Hot load of models into a TensorFlow Serving container

I know how to load a model into a container, and I also know that we can create a static config file, pass it to a TensorFlow Serving container when we run it, and later use one of the models in that config file. But I want to know whether there is any way to hot-load a completely new model (not a newer version of a previous model) into a running TensorFlow Serving container. What I mean is: we run the container with model A, and later we load model B into the container and use it. Can we do this? If yes, how?
You can.
First you need to copy the new model files to the model_base_path you specified when launching TensorFlow Serving, so that the server can see the new model. The directory layout is usually $MODEL_BASE_PATH/$model_a/$version_a/* and $MODEL_BASE_PATH/$model_b/$version_b/*.
Then you need to refresh TensorFlow Serving with a new model_config_file that includes an entry for the new model (see the docs on how to add entries to the model config file). To make the server pick up the new config, there are two ways to do it:
Save the new config file and restart TensorFlow Serving.
Reload the new model config on the fly without restarting TensorFlow Serving. This service is defined in model_service.proto as HandleReloadConfigRequest, but the server's REST API does not seem to support it, so you need to rely on the gRPC API. Sadly, the Python client for this gRPC service seems unimplemented. I managed to generate Java client code from the protobuf files, but it is quite complex. There is an example that explains how to generate Java client code for doing gRPC inference, and doing handleReloadConfigRequest() is very similar; see the sketch below.
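As a hedged sketch of the second option: newer releases of the tensorflow-serving-api pip package appear to ship generated Python stubs for ModelService, so if your installed version has them, an on-the-fly reload could look like this (host/port, model names, and base paths are placeholders):
import grpc
from tensorflow_serving.apis import model_service_pb2_grpc
from tensorflow_serving.apis import model_management_pb2

channel = grpc.insecure_channel("localhost:8500")  # gRPC port of the running server
stub = model_service_pb2_grpc.ModelServiceStub(channel)

# The reload replaces the whole config, so list every model that should stay
# loaded, plus the newly added one.
request = model_management_pb2.ReloadConfigRequest()
for name in ("model_a", "model_b"):  # model_b is the new model
    entry = request.config.model_config_list.config.add()
    entry.name = name
    entry.base_path = "/models/" + name  # must match the layout described above
    entry.model_platform = "tensorflow"

response = stub.HandleReloadConfigRequest(request, timeout=30.0)
print(response.status)  # error_code 0 means the reload succeeded
The same entries can equally be written into the static file passed via --model_config_file if you prefer the restart route.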

How do I add GCS credentials to TensorFlow?

I'm trying to train a model on Kaggle and dump TensorBoard logs into a GCS bucket. I'm hesitant to allow anonymous read/write on my project and would like to have TensorFlow use a custom service account with limited quotas for all GCP / gfile.GFile operations. Is there any way to provide TensorFlow with a service account JSON to use?
Is my best bet just security by obscurity?
I am not experienced with Kaggle and I do not fully understand what limits you want to apply to the service account, but you can follow these steps to set up service account access to Google Cloud Storage while using TensorFlow:
Follow this guide to implement the GCS custom FileSystem in TensorFlow.
Check the Python client library to instantiate the client.
The service account permissions required for storage are listed here.
To grant roles to a service account, follow this guide.
Check the snippet in Federico's post here, based on this documentation, to implement the service account in your Python code.
Snippet:
from google.oauth2 import service_account
from google.cloud import storage  # assumes the google-cloud-storage package is installed

SERVICE_ACCOUNT_FILE = 'service.json'
# Build credentials from the key file and pass them explicitly to the GCS client.
credentials = service_account.Credentials.from_service_account_file(
    SERVICE_ACCOUNT_FILE)
client = storage.Client(credentials=credentials, project=credentials.project_id)
If you have service account credentials in a JSON file, you can point the GOOGLE_APPLICATION_CREDENTIALS environment variable at it so that TensorFlow can read from and write to GCS via gs:// URLs.
You can test it out by running the following in bash (it downloads a smoke-test script from TensorFlow's repository and runs it against your bucket URL with your credentials):
wget https://raw.githubusercontent.com/tensorflow/tensorflow/master/tensorflow/tools/gcs_test/python/gcs_smoke.py
GOOGLE_APPLICATION_CREDENTIALS=my_credentials.json python gcs_smoke.py --gcs_bucket_url=gs://my_bucket/test_tf
This should create some dummy records in GCS and read from them. After this, you'd want to clean up the remaining temporary outputs to avoid further charges:
gsutil rm -r gs://my_bucket/test_tf

Access model version number in client when the model server loads the latest model based on incremental model numbers

I am serving two different models using the same model server via the model config file in TensorFlow Serving (1.3.0). Since the model version policy is set to "latest" by default, as soon as I upload a new version of a model, it gets loaded into the model server and the client can use it just fine. However, I would like my client to know which version of the model is serving its requests. How can I propagate the model version number (which is the directory name) from the server to the client? My model server code is similar to main.cc and my client code is inspired by the example provided in the TensorFlow Serving repository.