How to deploy a live-learning TensorFlow model in the cloud? - tensorflow

How do I deploy a TensorFlow model in the cloud so that it can keep learning and updating its weights as new input arrives? Most of the deployment methods I have seen involve freezing the model, which also freezes the weights. Is live learning possible, or is freezing the only way?

Freezing the model gives you the most compact form and lets you run a smaller inference node that you call just for prediction; it contains only the information needed to do exactly that.
If you want a model that can both learn online and serve inference, you can keep the full graph loaded with the newest weights. As a safeguard, save the weights from time to time. Alternatively, you can run two programs: one that serves inference with the latest frozen model, and another that you bring up from time to time to run a new round of training starting from the last saved weights. I recommend the second option. Hope it helps!
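As a rough, hypothetical sketch of the first approach (keeping the full graph loaded while periodically saving the newest weights), assuming a TF1-style graph where train_op, the placeholders x and y, and an incoming_data_stream() generator already exist:

    import tensorflow as tf

    # Assumes the graph (train_op, x, y) has already been built.
    saver = tf.train.Saver(max_to_keep=5)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        step = 0
        for features, labels in incoming_data_stream():  # hypothetical generator
            # Update the weights with the newly arrived example(s).
            sess.run(train_op, feed_dict={x: features, y: labels})
            step += 1

            # Persist the latest weights from time to time, so training can resume
            # later or a separate inference process can pick them up and freeze them.
            if step % 1000 == 0:
                saver.save(sess, "/tmp/online_model/model.ckpt", global_step=step)

In the two-program variant, a second process would periodically load the latest checkpoint, freeze it, and serve predictions from the frozen copy.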

Related

Tensorflow Object Detection API - What's actually test.record being used for?

I have a few doubts about the TensorFlow Object Detection API. Hopefully someone can help me out... Before that, I should mention that I am following what sendex is doing, so the steps basically come from him.
First doubt: Why do we need test.record for training? What does it do during training?
Second doubt: Sendex uses images from test.record to test the newly trained model. Doesn't the model already know those images, since they come from test.record?
Third doubt: In what kind of situation do we need to activate drop_out (in the .config file)?
1) It does nothing during training itself; you don't need it to train. But at some point the model begins to overfit: the loss on the training images keeps going down while the accuracy on the test images stops improving and starts to decline. That is the time to stop training, and to recognise that moment you need test.record (see the sketch after this answer).
2) Those images are used only to evaluate the model during training, not to train the net.
3) You do not have to activate it, but with dropout you usually achieve higher accuracy, since it helps prevent the net from overfitting.
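A minimal, framework-agnostic sketch of the early-stopping idea from point 1); train_one_epoch and evaluate are hypothetical placeholders for your own training and evaluation routines (the latter running against test.record):

    best_eval_accuracy = 0.0
    epochs_without_improvement = 0
    PATIENCE = 5  # stop after 5 epochs with no improvement on the eval set

    for epoch in range(100):
        train_one_epoch()            # trains on train.record
        eval_accuracy = evaluate()   # accuracy on test.record, which is never trained on

        if eval_accuracy > best_eval_accuracy:
            best_eval_accuracy = eval_accuracy
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= PATIENCE:
                # Training loss may still be falling, but the model has started
                # to overfit, so stop here.
                break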

What does freezing a graph in TensorFlow mean?

I am a beginner in NN APIs and TensorFlow.
I am trying to save my trained model in protobuf format (.pb); there are many blogs explaining how to save the model as a protobuf. One thing I did not understand is the importance of freezing the graph before saving it as a protobuf. I read that freezing converts variables to constants; does that mean the model is not trainable anymore?
What else will freezing do on models?
What does the model lose after freezing?
Can anyone please explain or give some pointers on details of freezing?
This is only a partial answer to your question.
A frozen graph is easy to optimize. When doing inference (forward propagation), for instance, you can fuse some of the layers together; you can't do this with a graph that is still split between variables and operations (an unfrozen graph). Why would you want to fuse layers together? There are multiple reasons. Going hardware-specific: it might be faster to compute a number of operations together on a group of tensors, in a way specific to the structure of your CPU or GPU. TensorRT, for instance, is a graph optimizer that starts from a frozen graph (more info on the graph optimizations done by TensorRT here: https://devblogs.nvidia.com/tensorrt-integration-speeds-tensorflow-inference/ ). It performs general graph optimizations as well as hardware-specific ones.
As far as I understand, you can unfreeze a graph. I have only worked on optimizing them, so I haven't used this feature myself, but there is code for it here: https://gist.github.com/tokestermw/795cc1fd6d0c9069b20204cbd133e36b
Here is another question that might be useful:
TensorFlow: Is there a way to convert a frozen graph into a checkpoint model?
It has not yet been answered though.
Freezing the model means producing a single file containing both the graph definition and the checkpoint variables, with those variables saved as constants within the graph structure. This eliminates the additional information stored in the checkpoint files, such as the gradients at each point, which are included so that the model can be reloaded and training continued from where you left off. As this is not needed when serving a model purely for inference, it is discarded during freezing. A frozen model is saved as a single Google protocol buffer (.pb) file.
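For concreteness, a minimal TF1-style sketch of freezing a checkpoint into a single .pb file; the checkpoint path and the output node name ("output/Softmax") are placeholder assumptions:

    import tensorflow as tf

    with tf.Session() as sess:
        # Load the graph definition and the trained variables from the checkpoint.
        saver = tf.train.import_meta_graph("/tmp/model/model.ckpt.meta")
        saver.restore(sess, "/tmp/model/model.ckpt")

        # Convert every variable reachable from the output node into a constant.
        frozen_graph_def = tf.graph_util.convert_variables_to_constants(
            sess,
            sess.graph.as_graph_def(),
            output_node_names=["output/Softmax"],
        )

        # Write the self-contained, inference-only graph to disk.
        with tf.gfile.GFile("/tmp/model/frozen_model.pb", "wb") as f:
            f.write(frozen_graph_def.SerializeToString())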

How to run inference on inception v3 trained models?

I've successfully trained the Inception v3 model from scratch on 200 custom classes. Now I have ckpt files in my output dir. How do I use those checkpoints to run inference?
Preferably, I'd like to load the model on the GPU and pass it images whenever I want, while the model stays resident on the GPU. Using TensorFlow Serving is not an option for me.
Note: I've tried to freeze these models, but failed to specify the output_nodes correctly while freezing. I used ImagenetV3/Predictions/Softmax, but couldn't use it with feed_dict because I couldn't get the required tensors out of the frozen model.
There is poor documentation on TF site & repo on this inference part.
It sounds like you're on the right track. You don't really do anything different at inference time than you do at training time, except that you don't ask the graph to compute the optimizer, and by not doing so, no weights are ever updated.
The save and restore guide in tensorflow documentation explains how to restore a model from checkpoint:
https://www.tensorflow.org/programmers_guide/saved_model
You have two options when restoring a model. Either you build the ops again from code (usually a build_graph() method) and then load the variables in from the checkpoint (this is the method I use most commonly), or you load both the graph definition and the variables from the checkpoint, provided the graph definition was saved with it (the second option is sketched after this answer).
Once you've loaded the graph, you'll create a session and ask the graph to compute just the output. The tensor ImagenetV3/Predictions/Softmax looks right to me (I'm not immediately familiar with the particular model you're working with). You will need to pass in the appropriate inputs (your images) and whatever other parameters the graph requires; sometimes an is_train boolean is needed, and other such details.
Since you aren't asking TensorFlow to compute the optimizer operation, no weights will be updated. There's really no difference between training and inference other than which operations you ask the graph to compute.
TensorFlow will use the GPU by default, just as it did during training, so all of that is pretty much handled behind the scenes for you.
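A hedged sketch of the second restore option described above (loading both the graph definition and the variables from the checkpoint) and asking the session for only the softmax output; the checkpoint path, the input tensor name "input:0", and my_image_batch are assumptions to replace with your own:

    import tensorflow as tf

    with tf.Session() as sess:
        # Restore the graph definition and the trained weights from the checkpoint.
        saver = tf.train.import_meta_graph("/output_dir/model.ckpt-100000.meta")
        saver.restore(sess, "/output_dir/model.ckpt-100000")

        graph = tf.get_default_graph()
        images_in = graph.get_tensor_by_name("input:0")  # check the real input name
        softmax = graph.get_tensor_by_name("ImagenetV3/Predictions/Softmax:0")

        # No optimizer op is requested here, so no weights are updated.
        probabilities = sess.run(softmax, feed_dict={images_in: my_image_batch})

If you are unsure of the tensor names, listing [n.name for n in graph.as_graph_def().node] is a quick way to find them.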

Deep Learning with TensorFlow on Compute Engine VM

I'm actually new to machine learning, but this topic is very interesting to me, so I'm using TensorFlow to classify some images from the MNIST dataset... I run this code on a Compute Engine VM at Google Cloud, because my computer is too weak for this. The code actually runs well, but the problem is that each time I log in to my VM and run the same code, I have to wait while my CNN model trains, and only afterwards can I run some tests, experiment with my data, make plots, or import some external images to improve my accuracy, etc.
Is there some way to save the result of training the model just once, somewhere, so that when I decide, for example, to log in to the same VM tomorrow... I don't have to wait for the model to train again? Is it possible to do this?
Or is there maybe another way to do something similar?
You can save a trained model in TensorFlow and then use it later by loading it; that way you only have to train your model once and can use it as many times as you want. To do that, you can follow the TensorFlow documentation on that topic, where you can find information on how to save and load the model. In short, you use the SavedModelBuilder class to define the type and location of your saved model, and then add the MetaGraphs and variables you want to save. Loading the saved model for later use is even easier, as you only have to run a command pointing to the location where the model was exported.
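A short sketch of that SavedModelBuilder flow (TF1 API); the export directory and the serving tag are illustrative:

    import tensorflow as tf

    export_dir = "/tmp/mnist_saved_model/1"

    # Saving, once training has finished.
    builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
    with tf.Session() as sess:
        # ... build the graph, initialize variables and train the model here ...
        builder.add_meta_graph_and_variables(sess, [tf.saved_model.tag_constants.SERVING])
    builder.save()

    # Loading later, e.g. the next day on the same VM, without retraining.
    with tf.Session(graph=tf.Graph()) as sess:
        tf.saved_model.loader.load(sess, [tf.saved_model.tag_constants.SERVING], export_dir)
        # The graph and the trained variables are now available for tests and plots.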
On the other hand, I would strongly recommend you to change your working environment in such a way that it can be more profitable for you. In Google Cloud you have the Cloud ML Engine service, which might be good for the type of work you are developing. It allows you to train your models and perform predictions without the need of an instance running all the required software. I happen to have worked a little bit with TensorFlow recently, and at first I was also working with a virtualized instance, but after following some tutorials I was able to save some money by migrating my work to ML Engine, as you are only charged for the usage. If you are using your VM only with that purpose, take a look at it.
You can of course consult all the available documentation, but as a first quickstart, if you are interested in ML Engine, I recommend you to have a look at how to train your models and how to get your predictions.

TensorFlow in production: How to retrain your models

I have a question related to this one:
TensorFlow in production for real time predictions in high traffic app - how to use?
I want to set up TensorFlow Serving to do inference as a service for our other application. I see how TensorFlow Serving helps me do that. Additionally, it mentions a continuous training pipeline, which is probably related to TensorFlow Serving being able to serve multiple versions of a trained model. But what I am not sure about is how to retrain the model as new data comes in. The other post mentions the idea of running retraining with cron jobs. However, I am not sure whether automatic retraining is a good idea. What architecture would you propose for a continuous retraining pipeline for a system that continuously receives new, labelled data?
Edit: It is a supervised learning case. The question is: would you automatically retrain your model after n new data points have come in, retrain automatically during the customer's downtime, or just retrain manually?
You probably want to use some kind of semi-supervised training. There's fairly extensive research in that area.
A crude but expedient way, which works well, is to use the best models you currently have to label the new, incoming data. Models can typically produce a score (hopefully a log-probability). You can use that score to train only on the data that fits well.
That is an approach that we have used in speech recognition and is an excellent baseline.
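An illustrative sketch of that confidence-filtered pseudo-labelling baseline; the predict_with_score method, the data source and the threshold are hypothetical:

    CONFIDENCE_THRESHOLD = 0.9  # tune this on a held-out labelled set

    pseudo_labelled = []
    for example in new_unlabelled_data:
        # The current best model predicts a label and a confidence score
        # (e.g. the max softmax probability or a log-probability).
        label, score = current_model.predict_with_score(example)
        if score >= CONFIDENCE_THRESHOLD:
            pseudo_labelled.append((example, label))

    # Retrain (for instance from the latest checkpoint, via a cron job or a
    # manual trigger) on the original labelled data plus these high-confidence
    # pseudo-labelled examples.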