CNN model for deployment: how to optimize - tensorflow

It's my first time deploying a model. I've created a CNN model using TensorFlow, Keras, and Xception, and the saved model is about 80 MB. When I load it inside a function and run a prediction, it takes about 4-5 seconds. Is there a way to reduce this time? Does the model have to be loaded for every prediction?

The model only needs to be loaded once in your program; for each prediction, you reuse the already loaded model. The prediction itself might still take some time, but TensorFlow doesn't reload the model on each prediction. A better approach is to save only the weights after training and, for inference, create the model architecture and then load the saved weights.
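The "load once, reuse for every prediction" idea can be sketched without any framework code. This is a minimal illustration, not the asker's actual setup: `load_model_from_disk` is a hypothetical stand-in for an expensive loader (for example, a Keras model-loading call), and the dummy "model" just sums each input row.

```python
import functools
import time


def load_model_from_disk(path):
    """Hypothetical stand-in for an expensive model loader."""
    time.sleep(0.05)  # simulate slow deserialization from disk
    # Dummy "model": returns the sum of each row in the batch.
    return lambda batch: [sum(row) for row in batch]


@functools.lru_cache(maxsize=None)
def get_model(path):
    # The first call for a given path pays the load cost;
    # later calls return the same cached object.
    return load_model_from_disk(path)


def predict(path, batch):
    model = get_model(path)  # cheap after the first call
    return model(batch)
```

In a web service, the same effect is usually achieved by loading the model at startup into a module-level variable, so request handlers never trigger a load.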

Related

Which layers are frozen using Tensorflow 2 Object detection API?

How can I tell which layers are frozen when fine-tuning a detection model from the TensorFlow 2 Model Zoo?
I have already successfully set the path for fine_tune_checkpoint and set fine_tune_checkpoint_type: detection, and in the proto file I read that "detection" means:
// 2. "detection": Restores the entire feature extractor.
The only parts of the full detection model that are not restored are the box and class prediction heads.
This option is typically used when you want to use a pre-trained detection model
and train on a new dataset or task which requires different box and class prediction heads.
I didn't really understand what that means. Does "restored" mean "frozen" in this context?
As I understand it, the TensorFlow 2 Object Detection API currently does not freeze any layers when training from a fine-tune checkpoint. There is a reported issue requesting support for specifying which layers to freeze in the pipeline config. If you look at the training step function, you can see that all trainable variables are used when applying gradients during training.
Restored here means that the model weights are copied from the checkpoint to be used as a starting point for training. Frozen would mean that the weights are not changed (i.e. no gradient is applied) during training.
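The distinction can be illustrated with a toy sketch. This is not Object Detection API code: weights are plain floats in a dictionary, and `restore`, `sgd_step`, `skip`, and `frozen` are hypothetical names chosen for illustration.

```python
def restore(model_weights, checkpoint, skip=()):
    """Copy checkpoint values into the model, except names listed in `skip`."""
    for name, value in checkpoint.items():
        if name in model_weights and name not in skip:
            model_weights[name] = value
    return model_weights


def sgd_step(model_weights, grads, lr, frozen=()):
    """Apply one gradient step; names listed in `frozen` are left unchanged."""
    for name, grad in grads.items():
        if name not in frozen:
            model_weights[name] -= lr * grad
    return model_weights


# Restoring sets the starting point (here the head is not restored) ...
weights = restore({"backbone": 0.0, "head": 0.5},
                  {"backbone": 1.0, "head": 9.0}, skip=("head",))
# ... but a restored weight is still updated unless it is also frozen.
sgd_step(weights, {"backbone": 2.0, "head": 2.0}, lr=0.5)
```

After the step, the restored "backbone" value has moved away from its checkpoint value, which is the behaviour described above for fine-tuning without frozen layers.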

Re-training keras model

I have a trained Keras model, and I'm using it to generate data. I want to use that data to retrain the model. After retraining, the model seems to know how to predict the new data but has somehow lost its knowledge of the previous data. I do not compile the model again before training. Are there any special steps needed to retrain a model in Keras?

Should TensorFlow model saving be approached differently during training vs. deployment?

Assume that I have a CNN which I am training on some dataset. The most important part of the model is the CNN architecture.
Now, when I write the code, I define the model structure in a Python class. However, outside that class, I define a number of other nodes such as loss, accuracy, a tf.Variable to keep count of epochs, and so on.
While training, in order to resume properly, I'd like to save all of these nodes (e.g. loss, epoch variable, etc.), and not just the CNN structure.
However, once I am done with training, I would like to save only the CNN architecture, without the nodes for loss, accuracy, etc. This would give people using my model the freedom to write their own fine-tuning code.
How can I achieve this in TF code? Can someone show an example?
Is this approach to saving also followed by others? I just want to know if my approach is right.

Convert a model trained in CUDNNLSTM so that the trained model can be run on CPU also

I have a model trained with CUDNNLSTM. How can I use it on a CPU? How can we export the CUDNNLSTM variables to CPU-based weights so that the trained model can also be run on a CPU?
One safe way, that always works, would be to simply save the variables to a file in an appropriate format such as HDF5 and manually load them again. This way you have complete control over what is happening and don't depend on the system you are using. You can even train a model in TensorFlow, save the variables, and load them for use with a different library like PyTorch (assuming you defined the exact same model there).
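A framework-neutral sketch of this save-and-reload idea follows. It makes several assumptions: JSON is used as a stand-in for HDF5 so that only the standard library is needed, weights are represented as nested lists of floats keyed by variable name, and `save_weights`, `load_weights`, and `shape_of` are hypothetical helpers. The shape check is there because the target model has to expect exactly the layout that was saved.

```python
import json


def save_weights(weights, path):
    """Write a {name: nested list of floats} mapping to a JSON file."""
    with open(path, "w") as f:
        json.dump(weights, f)


def shape_of(value):
    """Return the shape of a rectangular nested list as a tuple."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        value = value[0] if value else None
    return tuple(shape)


def load_weights(path, expected_shapes):
    """Read weights back and check each one against the expected shape."""
    with open(path) as f:
        weights = json.load(f)
    for name, shape in expected_shapes.items():
        if name not in weights:
            raise KeyError(f"missing variable: {name}")
        if shape_of(weights[name]) != tuple(shape):
            raise ValueError(f"shape mismatch for {name}")
    return weights
```

With a real model, the values would come from the training framework's variables and be assigned one by one to the matching variables of the CPU model after loading.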

How to run inference on inception v3 trained models?

I've successfully trained the Inception v3 model from scratch on 200 custom classes. Now I have ckpt files in my output dir. How do I use those models to run inference?
Preferably, I'd like to load the model onto the GPU once and pass images to it whenever I want while the model stays on the GPU. Using TensorFlow Serving is not an option for me.
Note: I've tried to freeze these models but failed to set output_nodes correctly while freezing. I used ImagenetV3/Predictions/Softmax but couldn't use it with feed_dict, as I couldn't get the required tensors from the frozen model.
The documentation on the TF site and repo is poor for this inference part.
It sounds like you're on the right track. You don't really do anything different at inference time compared with training time, except that you don't ask the graph to compute the optimizer operation. Because of that, no weights are ever updated.
The save and restore guide in the TensorFlow documentation explains how to restore a model from a checkpoint:
https://www.tensorflow.org/programmers_guide/saved_model
You have two options when restoring a model. Either you build the ops again from code (usually with a build_graph() method) and then load the variables from the checkpoint, which is the method I use most commonly, or you load both the graph definition and the variables from the checkpoint, if the graph definition was saved with the checkpoint.
Once you've loaded the graph, you create a session and ask the graph to compute just the output. The tensor ImagenetV3/Predictions/Softmax looks right to me (I'm not immediately familiar with the particular model you're working with). You will need to pass in the appropriate inputs: your images, and possibly whatever other parameters the graph requires. Sometimes an is_train boolean is needed, along with other such details.
Since you aren't asking TensorFlow to compute the optimizer operation, no weights will be updated. There's really no difference between training and inference other than which operations you request the graph to compute.
TensorFlow will use the GPU by default, just as it did during training, so all of that is pretty much handled behind the scenes for you.