I have some models running on AI Platform in GCP which are serving predictions without a problem.
Now I am trying to automate this deployment process using Kubeflow pipelines so that the model version gets updated periodically. I tried to create some pipelines using the available samples, but none of them target AI Platform.
The training of the model has been handled by AI Platform Jobs with the following parameters:
Python: 3.7
Framework: TensorFlow
Framework version: 2.1
ML runtime version: 2.1
Trained models are created periodically and saved in buckets.
How can I automate this deployment process using pipelines?
If there is an alternative approach to this automation, I would like to try it as well.
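For reference, the step I want to automate looks roughly like the sketch below, assuming google-api-python-client; the project, model, version, and bucket path are placeholders:

```python
# A sketch of the deployment step, assuming google-api-python-client is
# installed and the caller is authenticated; all names are placeholders.
from googleapiclient import discovery

def deploy_new_version(project: str, model: str, version: str, saved_model_uri: str):
    """Create a new AI Platform model version pointing at a SavedModel in GCS."""
    ml = discovery.build("ml", "v1")
    body = {
        "name": version,
        "deploymentUri": saved_model_uri,  # e.g. gs://my-bucket/exports/1234/
        "runtimeVersion": "2.1",
        "framework": "TENSORFLOW",
        "pythonVersion": "3.7",
    }
    parent = f"projects/{project}/models/{model}"
    # Returns a long-running operation; poll it before shifting traffic.
    return ml.projects().models().versions().create(parent=parent, body=body).execute()
```

Wrapped in a pipeline component (for example via kfp.components.create_component_from_func), this could run on a schedule after each training job writes a new SavedModel to the bucket.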
I was able to run MNIST training with TensorFlow in a multi-server environment using MirroredStrategyRunner, following https://docs.databricks.com/_static/notebooks/deep-learning/spark-tensorflow-distributor.html.
What is the equivalent way of running inference in a multi-server environment via Databricks?
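For context, the approach I am considering is a scalar-iterator pandas UDF that loads the model once per task and scores partitions in parallel. A sketch, assuming Spark 3.x on Databricks, a SavedModel under /dbfs/models/mnist, and an existing DataFrame df with a features column of flattened pixel arrays:

```python
# A sketch of distributed inference via a pandas UDF; the model path and
# column names are placeholders, and `df` is an existing Spark DataFrame.
from typing import Iterator

import numpy as np
import pandas as pd
import tensorflow as tf
from pyspark.sql.functions import pandas_udf

@pandas_udf("long")
def predict(batches: Iterator[pd.Series]) -> Iterator[pd.Series]:
    # Iterator form: the model is loaded once per task, not once per batch.
    model = tf.keras.models.load_model("/dbfs/models/mnist")
    for features in batches:
        x = np.stack(features.map(np.asarray))
        yield pd.Series(model.predict(x, verbose=0).argmax(axis=1))

scored = df.withColumn("prediction", predict("features"))
```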
I am trying to standardize our deployment workflow for machine vision systems, so we came up with the following workflow.
[Image: deployment workflow diagram]
We built a prototype following this workflow. There is no problem with the GCP side whatsoever, but when we export the models we train on Vertex AI, we get the three formats mentioned in the workflow:
SavedModel
TFLite
TFJS
We then tried to convert these models to ONNX, but failed with different errors.
SavedModel - We always get the same error, regardless of the parameters:
[Image: SavedModel conversion error]
I tracked the error down and found that the model does not even load in TensorFlow itself, which is odd, since it was exported from GCP Vertex AI, which is built on TensorFlow.
TFLite - Conversion succeeds with ONNX opset 15 (earlier opsets fail), but the NVIDIA TensorRT ONNX parser then does not recognize the model during the ONNX-to-TRT conversion.
TFJS - not tried yet.
So we are blocked here due to these problems.
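For reference, the conversions we attempted look roughly like this sketch, using tf2onnx; the paths are placeholders for the Vertex AI exports:

```python
# A sketch of the conversion attempts, assuming tf2onnx is installed;
# paths are placeholders for the models exported from Vertex AI.
import tf2onnx

# SavedModel -> ONNX (the step that always errors for us).
tf2onnx.convert.from_saved_model(
    "exported/saved_model", opset=15, output_path="model_from_savedmodel.onnx"
)

# TFLite -> ONNX (converts at opset 15, but TensorRT's parser rejects it).
tf2onnx.convert.from_tflite(
    "exported/model.tflite", opset=15, output_path="model_from_tflite.onnx"
)
```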
We can run the models exported directly from Vertex AI on the Jetson Nano device, but TF-TRT and TensorFlow are not memory-optimized on the GPU, so the system freezes after 3 to 4 hours of running.
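In case it is relevant, we already apply the standard TF 2.x GPU-memory settings on the Nano, along these lines; it has not fixed the freeze:

```python
# A minimal sketch of the standard TF 2.x GPU-memory setting we use on
# the Jetson; it limits TensorFlow's allocator but has not fixed the freeze.
import tensorflow as tf

for gpu in tf.config.list_physical_devices("GPU"):
    # Grow allocations on demand instead of grabbing all GPU memory up front.
    tf.config.experimental.set_memory_growth(gpu, True)
```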
We tried this workflow once with Google Teachable Machine and it worked out well; all steps work perfectly. So I am confused about how to make sense of this: the full workflow works with a Teachable Machine model, which is created by Google, but not with a Vertex AI model, which is developed by the same company.
Or am I doing something wrong in this workflow?
For background: we are developing this workflow inside a C++ framework for a real-time application in an industrial environment.
I am not able to find any good reference for deploying a PyTorch model in IBM Watson.
I have created a BERT model with the Hugging Face transformers library, implemented in PyTorch. Now I need to deploy the PyTorch model in IBM Watson for real-time prediction.
I have searched a lot but couldn't find a good reference for the deployment steps to follow.
Have you had a look at https://github.com/IBM/pytorch-on-watson-studio?
This code pattern takes you through the steps to create a model (a simple handwritten digit recognizer) in Watson Studio with PyTorch:
Log into IBM Watson Studio
Run the Jupyter notebook in Watson Studio
Use PyTorch to download and process the data
Use Watson Machine Learning to train and deploy the model
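For the deployment step specifically, the Watson Machine Learning client flow looks roughly like the sketch below; this assumes the ibm-watson-machine-learning package and an existing deployment space, and the credentials, model type string, and software spec name are placeholders that depend on your runtime:

```python
# A rough sketch of storing and deploying a PyTorch model with the WML
# client; credentials, space ID, type, and spec names are placeholders.
from ibm_watson_machine_learning import APIClient

client = APIClient({"url": "https://us-south.ml.cloud.ibm.com", "apikey": "YOUR_API_KEY"})
client.set.default_space("YOUR_SPACE_ID")

meta = {
    client.repository.ModelMetaNames.NAME: "bert-pytorch",
    # Placeholder type/spec; pick the ones matching your PyTorch version.
    client.repository.ModelMetaNames.TYPE: "pytorch-onnx_1.10",
    client.repository.ModelMetaNames.SOFTWARE_SPEC_UID:
        client.software_specifications.get_uid_by_name("runtime-22.1-py3.9"),
}
model_details = client.repository.store_model(model="model.tar.gz", meta_props=meta)
model_uid = client.repository.get_model_uid(model_details)

deployment = client.deployments.create(
    model_uid,
    meta_props={
        client.deployments.ConfigurationMetaNames.NAME: "bert-online",
        client.deployments.ConfigurationMetaNames.ONLINE: {},
    },
)
```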
I want to convert my TensorFlow 1.1-based model to TensorFlow Lite in order to serve the model locally and remotely for a PWA. The official guide only offers Python APIs for 1.11 at the earliest, and the command-line tools only seem to work starting at 1.7. Is it possible to convert a 1.1 model to TensorFlow Lite? Has anyone had experience with this?
The TF model is an out-of-the-box pre-trained BiDAF model. I am having difficulty serving the full TF app on Heroku, which is unable to run it. I would like to try a TF Lite app to see if hosting it locally will make it faster and easier to set up as a PWA.
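One approach I am considering (untested): freeze the 1.1 graph to a GraphDef, then run the conversion under a newer TensorFlow, since the converter operates on the serialized graph rather than the runtime that produced it. A sketch, where the tensor names are placeholders:

```python
# A sketch, not verified against a TF 1.1 graph: run under a newer
# TensorFlow (e.g. 1.15) on the frozen GraphDef exported from 1.1.
# "input" and "output" are placeholders for the real tensor names.
import tensorflow as tf

converter = tf.compat.v1.lite.TFLiteConverter.from_frozen_graph(
    graph_def_file="frozen_graph.pb",
    input_arrays=["input"],
    output_arrays=["output"],
)
tflite_model = converter.convert()
with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```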
The tutorial on the GitHub page for the TensorFlow Object Detection API also has information on running the training on Google Cloud Platform.
But I need to run the training on an AWS instance. I have the TFRecord files with me. Is there a tutorial available for this? Googling doesn't help much; I am new to AWS.
You need to launch an instance which already has TensorFlow installed on it. AWS has prepared AMIs (the Deep Learning AMIs) for that.
See here: https://aws.amazon.com/tensorflow/
Then you just upload your TFRecords, pipeline config, and code to the instance and run the training script, as sketched below.
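For the Object Detection API specifically, launching training on the instance would look roughly like this sketch, assuming the TF1 research/object_detection checkout; the paths and config file are placeholders:

```python
# A sketch of launching Object Detection API training on the instance,
# assuming the TF1 models/research checkout; paths are placeholders.
import subprocess

subprocess.run(
    [
        "python", "models/research/object_detection/model_main.py",
        "--pipeline_config_path=pipeline.config",  # points at your TFRecords
        "--model_dir=training_output",
        "--alsologtostderr",
    ],
    check=True,
)
```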