I know how to load a model into a container, and I also know that we can create a static config file, pass it to a TensorFlow Serving container when we run it, and later use one of the models listed in that config file. But I want to know whether there is any way to hot-load a completely new model (not a newer version of an existing model) into a running TensorFlow Serving container. What I mean is: we run the container with model A, and later we load model B into the container and use it. Can we do this? If so, how?
You can.
First you need to copy the new model files to the model_base_path you specified when launching the model server, so that the server can see the new model. The directory layout is usually: $MODEL_BASE_PATH/$model_a/$version_a/* and $MODEL_BASE_PATH/$model_b/$version_b/*
Then you need to refresh the server with a new model_config_file that includes an entry for the new model (see here for how to add entries to the model config file; a sample config is shown after the list below). To make the server pick up the new config, there are two ways to do it:
Save the new config file and restart the server.
Reload the new model config on the fly, without restarting the server. This service is defined in model_service.proto as HandleReloadConfigRequest, but the server's REST API does not seem to expose it, so you need to rely on the gRPC API. Sadly, the Python client for this gRPC service seems unimplemented. I managed to generate Java client code from the protobuf files, but it is quite complex. The example here explains how to generate Java client code for gRPC inferencing, and doing handleReloadConfigRequest() is very similar.
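For reference, a model_config_file that lists both models might look like the sketch below; the model names and base paths are placeholders, so adjust them to your own $MODEL_BASE_PATH layout:

model_config_list {
  config {
    name: "model_a"
    base_path: "/models/model_a"
    model_platform: "tensorflow"
  }
  config {
    name: "model_b"
    base_path: "/models/model_b"
    model_platform: "tensorflow"
  }
}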
I am trying to integrate a SageMaker endpoint that uses a built-in SageMaker container with Datadog. Is there a way to avoid CloudWatch and still get the logs or metrics? I found that ddtrace can help with this, but the issue is that using ddtrace requires running gunicorn through ddtrace, and for that I would need to build my own container, which is not possible since I am using the built-in container. Is it possible to use ddtrace somehow? Mainly: if I want to use the TensorFlow Serving (AWS) container with Datadog, how can I do so without using CloudWatch logs?
I have followed this tutorial, but I want to know how I can use it with the built-in container: https://medium.com/tech-shift-com/monitoring-amazon-sagemaker-endpointwith-datadog-ae40dd2fab05
You can customize the prebuilt images. Follow the SageMaker docs for extending a prebuilt container. In step 2, they demonstrate how to define the base image you are extending. Once you have that, I think it should be as easy as adding RUN pip install ddtrace to your Dockerfile.
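For illustration, the Dockerfile could be as small as the sketch below; the FROM line is a placeholder for whichever prebuilt image URI you are extending (taken from step 2 of those docs):

# Placeholder: substitute the actual prebuilt SageMaker image URI you are extending
FROM <account>.dkr.ecr.<region>.amazonaws.com/<prebuilt-image>:<tag>
# Add the Datadog tracing library on top of the prebuilt image
RUN pip install ddtrace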
You might also try adding a SageMaker requirements ENV var when defining the model, if that works for you.
I'm experimenting with TensorFlow Serving, and I succeeded in sending requests to the half_plus_two example model in the simplest setting (see docs here). By simple setting I mean embedding the model (more precisely, the directory that contains a version subdirectory, with all the model files under that subdirectory) in my Docker container and starting tensorflow_model_server either with model_name and model_base_path as parameters or with the model_config_file parameter.
When I try to put the model on S3 (a private S3-compatible storage, not AWS), the server starts and finds the model, as seen in the logs:
I tensorflow_serving/sources/storage_path/file_system_storage_path_source.cc:403] File-system polling update: Servable:{name: half_plus_two version: 1}; Servable path: s3://tftest/half_plus_two/1; Polling frequency: 1
The request to the model no longer succeeds, though. The error I get is:
Attempting to use uninitialized value b\n\t [[{{node b/read}}]]
It is as if using S3 does not give the model enough time to initialize its values. Does anyone know how to solve this problem?
I finally sorted it out. The problem was that the S3 content was not correct: it contained all the needed files (OK) plus the directory objects (not OK). The source of the problem was my copy procedure from GCP to S3, which is based on the google.cloud storage client. So when I did:
blobs = storage_client.list_blobs(bucketName, prefix=savedDir)
and looped over the blobs to copy each object to S3, I was also copying the directory placeholders. Apparently the S3 connector in TensorFlow Serving does not like that.
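For anyone hitting the same issue, here is a minimal sketch of the corrected copy loop. The boto3 client, endpoint, bucket name and prefix are illustrative, and download_as_bytes assumes a reasonably recent google-cloud-storage; the important part is skipping the zero-byte "directory" blobs whose names end with a slash:

from google.cloud import storage
import boto3

storage_client = storage.Client()
# For a private S3-compatible storage, point boto3 at the custom endpoint
s3 = boto3.client("s3", endpoint_url="https://my-private-s3.example.com")

blobs = storage_client.list_blobs(bucketName, prefix=savedDir)
for blob in blobs:
    # Skip "directory" placeholder objects; copying them is what upset
    # the S3 connector in TensorFlow Serving
    if blob.name.endswith("/"):
        continue
    s3.put_object(Bucket="tftest", Key=blob.name, Body=blob.download_as_bytes())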
In a current project I have a problem understanding (and configuring) routing within my Vue.js app.
Our Setup
We have a setup where, for each pull request in our repos, a new snapshot environment is created. This environment is one namespace within a Kubernetes cluster. All services are deployed in their current develop state, together with the new "snapshot" version of the service that triggered the CI/CD pipeline. To give each snapshot environment a distinct route, we use the namespace as part of the URL (https://HOST/NAMESPACE/APP/paths).
Our Problem
As you can see, the URL is highly dynamic. Until now we could just build the container with the path baked in and be happy; that is our current setup. Unfortunately, we want the possibility to deploy each and every container image on any HOST as well as any NAMESPACE, and those parts are only known at runtime, not in the CI/CD pipeline.
Is there any way to handle such a scenario with Vue.js? I basically have every freedom to edit the app as well as the container, but I can't change the way we want to host our app. Currently we build the app on the cluster and inject the NAMESPACE, which was the "easiest" way to do this. But if there is any other way, I would love to keep the build and run steps separate.
Thanks in advance.
I am studying TensorFlow Serving. I am not familiar with TensorFlow and have many difficulties, but I am working through the Google documentation and other documents.
For example, after downloading the TensorFlow Serving source and compiling it, running
tensorflow_model_server --port=9000 --model_name=mnist --model_base_path=/tmp/mnist_model
works normally and communicates with clients over gRPC.
However, should I use TensorFlow Serving only through the binaries already provided by Google, such as tensorflow_model_server?
Or can I include the headers in C++ and link against the library so that I can write my own program?
For serving, you can use the TensorFlow Serving C++ API directly; here is some example code.
Additionally, Google also provides a Docker image that serves the models and exposes a client API in both RESTful and gRPC style, so that you can write a client in any language.
The tensorflow_model_server is part of that Dockerized server, and you will need to write your own client to interact with it; here are some code examples for making RESTful or gRPC calls to the server.
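For example, with the Docker image's default REST port (8501) published, a predict call to the half_plus_two demo model looks roughly like the sketch below; the host, port, model name and inputs are the standard example values and may differ in your setup:

import json
import requests

# Assumes the serving container publishes the default REST port 8501
url = "http://localhost:8501/v1/models/half_plus_two:predict"
payload = {"instances": [1.0, 2.0, 5.0]}
response = requests.post(url, data=json.dumps(payload))
print(response.json())  # expected: {"predictions": [2.5, 3.0, 4.5]}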
I am serving two different models from the same model server via the model config file in TensorFlow Serving (1.3.0). Since the model version policy is set to "latest" by default, as soon as I upload a new version of a model it gets loaded into the model server and the client can use it just fine. However, I would like my client to be aware of which version of the model it is being served. How can I propagate the model's version number (which is the directory name) from the server to the client? My model server code is similar to main.cc, and my client code is inspired by this example provided in the TensorFlow Serving repository.