Quantization-aware (re)training of a Keras model - TensorFlow

I have a (trained) tf.keras model which I would like to convert to a quantized model and retrain with TensorFlow's fake-quant strategy (using Python as the frontend).
Could I somehow apply tf.contrib.quantize.create_training_graph directly to the Keras model (graph) and then retrain?
There seems to be a problem with the fact that the session has already been created by the time the graph is taken from K.get_session().graph.
For example, the following approach:
import tensorflow as tf
import tensorflow.contrib.lite as tflite
from tensorflow.contrib.quantize import create_training_graph

keras_graph = tf.keras.backend.get_session().graph
create_training_graph(input_graph=keras_graph,
                      quant_delay=int(0 * (len(X_train) / batch_size)))
...
model.compile(...)
model.fit_generator(...)
results in the following message:
"Operation '{name:'act_softmax/sub' id:2294 op device:{} def:{{{node act_softmax/sub}} = Sub[T=DT_FLOAT](conv_preds/act_quant/FakeQuantWithMinMaxVars:0, act_softmax/Max)}}' was changed by updating input tensor after it was run by a session. This mutation will have no effect, and will trigger an error in the future. Either don't modify nodes after running them or create a new session."
And true enough, the error:
tensorflow.python.framework.errors_impl.FailedPreconditionError: Attempting to use uninitialized value conv_preds/act_quant/conv_preds/act_quant/max/biased
(i.e., does create_training_graph need the graph before the session is created? Is it possible to get the graph from a Keras model before the session is instantiated?)
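A sketch of the ordering I imagine is needed, where the graph and session are created explicitly before Keras builds the model; build_keras_model(), weights.h5 and train_generator are placeholders for my model definition, trained weights and data:

import tensorflow as tf
import tensorflow.keras.backend as K
from tensorflow.contrib.quantize import create_training_graph

train_graph = tf.Graph()
train_sess = tf.Session(graph=train_graph)
K.set_session(train_sess)

with train_graph.as_default():
    model = build_keras_model()                         # placeholder: rebuild the architecture
    create_training_graph(input_graph=train_graph,      # rewrite before any op has been run
                          quant_delay=0)
    train_sess.run(tf.global_variables_initializer())   # initialize the new fake-quant variables
    model.load_weights('weights.h5')                    # placeholder: restore the trained weights
    model.compile(optimizer='adam', loss='categorical_crossentropy')
    model.fit_generator(train_generator, epochs=5)      # placeholder generator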
Alternatively, if this doesn't work, could I convert the (h5) model to a checkpoint, then somehow load the model from this checkpoint into a TensorFlow graph and continue working in pure TensorFlow?
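(I imagine the conversion itself would look roughly like this, with placeholder paths:)

import tensorflow as tf

model = tf.keras.models.load_model('trained_model.h5')                # placeholder path
saver = tf.train.Saver()
saver.save(tf.keras.backend.get_session(), 'checkpoints/model.ckpt')  # placeholder path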
Would appreciate any help or pointers.
Thank you!

Related

Use a TensorFlow 2 saved model for object detection

I'm quite new to object detection, but I managed to train my first TensorFlow custom model yesterday. I think it worked fine besides some warnings; at least I got my exported_model folder with a checkpoint, saved model and pipeline.config. I built it with exporter_main_v2.py from TensorFlow. I just used some images of deer for training and now want to try to detect deer in different pictures.
That's what I would like to test now, but I don't know how. I already did an object detection tutorial with pre-trained models and it worked fine. I tried to just replace config_file_path, saved_model_path and image_path with the paths pointing to my exported model, but it didn't work:
error: OpenCV(4.6.0) D:\a\opencv-python\opencv-python\opencv\modules\dnn\src\tensorflow\tf_io.cpp:42: error: (-2:Unspecified error) FAILED: ReadProtoFromBinaryFile(param_file, param). Failed to parse GraphDef file: D:\VSCode\Machine_Learning_Tests\Tensorflow\workspace\exported_models\first_model\saved_model\saved_model.pb in function 'cv::dnn::ReadTFNetParamsFromBinaryFileOrDie'
There are endless tutorials on how to train a custom detector, but I can't find a good explanation of how to manually test my exported model.
Thanks in advance!
EDIT: I need to know how to build a script where I can load a model I saved with TensorFlow's exporter_main_v2.py together with an image I want to test it on, and get a result, either as text or as rectangles drawn on the picture. I have seen many tutorials, but none works for me with a model saved with exporter_main_v2.py.
From the error it looks like you have a model saved as .pb. If you want to do inference you can write something like this:
# load the model
model = tf.keras.models.load_model(my_model_dir)
prediction = model.predict(x=x_test, ...)
You'll have to set x, which is the only mandatory argument; it is your test dataset (the images you want predictions for). predict is useful when you have a large number of images: it handles prediction in batches, which avoids filling up memory. If you have just a few, you can use the model's __call__() method directly, like this:
prediction = model(x_test, training=False)
More about prediction can be found in the TensorFlow documentation.
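If the exported folder is a TF Object Detection API SavedModel (which is what exporter_main_v2.py produces) rather than a Keras model, it can also be loaded directly with tf.saved_model.load. A minimal sketch, using the path from the question, a hypothetical test image test_deer.jpg, and the Object Detection API's standard output keys:

import numpy as np
import tensorflow as tf
from PIL import Image

# Load the SavedModel exported by exporter_main_v2.py (path taken from the question)
detect_fn = tf.saved_model.load(
    'D:/VSCode/Machine_Learning_Tests/Tensorflow/workspace/exported_models/first_model/saved_model')

# Read one test image and add a batch dimension
image_np = np.array(Image.open('test_deer.jpg'))            # hypothetical test image
input_tensor = tf.convert_to_tensor(image_np)[tf.newaxis, ...]

detections = detect_fn(input_tensor)
print(detections['detection_scores'][0][:5])                # confidence of the top detections
print(detections['detection_boxes'][0][:5])                 # matching boxes (normalized coordinates)
print(detections['detection_classes'][0][:5])               # matching class ids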

create_training_graph() failed when converted MobileFacenet to quantize-aware model with TF-lite

I am trying to quantize MobileFacenet (code from sirius-ai) according to the suggestion, and I think I have hit the same issue as this one.
When I add tf.contrib.quantize.create_training_graph() to the training graph
(train_nets.py ln.187: before train_op = train(...), or in train() in utils/common.py ln.38, before the gradients),
it does not add quantization-aware ops to the graph to collect the dynamic-range max/min.
I assumed I would see some additional nodes in TensorBoard, but I did not, so I think the quantization-aware ops were not successfully added to the training graph.
Tracing through TensorFlow, I found that _FindLayersToQuantize() returned nothing.
However, when I add tf.contrib.quantize.create_eval_graph() to rewrite the graph, I can see some quantization-aware ops such as act_quant...
Since the ops were never added to the training graph, I have no weights to load into the eval graph.
Thus I get error messages such as:
Key MobileFaceNet/Logits/LinearConv1x1/act_quant/max not found in checkpoint
or
tensorflow.python.framework.errors_impl.FailedPreconditionError: Attempting to use uninitialized value MobileFaceNet/Logits/LinearConv1x1/act_quant/max
Does anyone know how to fix this error, or how to get a quantized MobileFacenet with good accuracy?
Thanks!
Hi,
Unfortunately, the contrib/quantize tool is now deprecated. It won't be able to support newer models, and we are not working on it anymore.
If you are interested in QAT, I would recommend trying the new TF/Keras QAT API. We are actively developing that and providing support for it.
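For reference, a minimal sketch of what QAT looks like with the newer Keras API from the tensorflow_model_optimization package; build_mobilefacenet_keras() and train_ds are placeholders for a Keras re-implementation of the network and its training dataset:

import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Wrap the float Keras model with fake-quant ops, then fine-tune as usual
model = build_mobilefacenet_keras()                        # placeholder Keras model
q_aware_model = tfmot.quantization.keras.quantize_model(model)
q_aware_model.compile(optimizer='adam',
                      loss='sparse_categorical_crossentropy',
                      metrics=['accuracy'])
q_aware_model.fit(train_ds, epochs=2)                      # placeholder dataset

# Convert the fine-tuned model to a quantized TFLite flatbuffer
converter = tf.lite.TFLiteConverter.from_keras_model(q_aware_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()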

Using model optimizer for tensorflow slim models

I am aiming to run inference on a TensorFlow-Slim model with the Intel OpenVINO optimizer, using the OpenVINO docs and slides for inference and the TF-Slim docs for training the model.
It's a multi-class classification problem. I have trained a TF-Slim mobilenet_v2 model from scratch (using the script train_image_classifier.py). Evaluation of the trained model on the test set gives relatively good results to begin with (using the script eval_image_classifier.py):
eval/Accuracy[0.8017] eval/Recall_5[0.9993]
However, a single .ckpt file is not saved (even though at the end of the train_image_classifier.py run there is a message like "model.ckpt is saved to checkpoint_dir"); instead there are 3 files (.ckpt-180000.data-00000-of-00001, .ckpt-180000.index, .ckpt-180000.meta).
The OpenVINO model optimizer requires a single checkpoint file.
According to the docs, I call mo_tf.py with the following params:
python mo_tf.py --input_model D:/model/mobilenet_v2_224.pb --input_checkpoint D:/model/model.ckpt-180000 -b 1
It gives the error (the same happens if I pass --input_checkpoint D:/model/model.ckpt):
[ ERROR ] The value for command line parameter "input_checkpoint" must be existing file/directory, but "D:/model/model.ckpt-180000" does not exist.
The error message is clear: there are no such files on disk. But as far as I know, most TF utilities resolve .ckpt-????.meta to .ckpt under the hood.
Trying to call:
python mo_tf.py --input_model D:/model/mobilenet_v2_224.pb --input_meta_graph D:/model/model.ckpt-180000.meta -b 1
Causes:
[ ERROR ] Unknown configuration of input model parameters
It doesn't matter to me which way the graph is transferred to the OpenVINO intermediate representation; I just need to reach that result.
Thanks a lot.
EDIT: I managed to run the OpenVINO model optimizer on a frozen graph of the TF-Slim model. However, I still have no idea why my previous attempts (based on the docs) failed.
You can try converting the model to frozen format (.pb) and then converting it with OpenVINO.
.ckpt-meta contains the metagraph, i.e. the computation graph structure without variable values (the one you can observe in TensorBoard).
.ckpt-data contains the variable values, without the graph structure. To restore a model we need both the meta and data files.
A .pb file saves the whole graph (meta + data).
As per the documentation of OpenVINO:
When a network is defined in Python* code, you have to create an inference graph file. Usually, graphs are built in a form that allows model training. That means that all trainable parameters are represented as variables in the graph. To use the graph with the Model Optimizer, it should be frozen.
https://software.intel.com/en-us/articles/OpenVINO-Using-TensorFlow
OpenVINO optimizes the model by converting the weighted graph passed in frozen form.
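A rough sketch of the freezing step with TF 1.x tooling follows; the output node name MobilenetV2/Predictions/Reshape_1 is an assumption (it is the usual predictions node of slim's mobilenet_v2, but check your graph in TensorBoard), and the paths follow the question:

import tensorflow as tf
from tensorflow.python.framework import graph_util

with tf.Session() as sess:
    # Rebuild the graph from the metagraph and restore the trained variables
    saver = tf.train.import_meta_graph('D:/model/model.ckpt-180000.meta')
    saver.restore(sess, 'D:/model/model.ckpt-180000')

    # Bake the variables into constants, keeping only the inference subgraph
    frozen = graph_util.convert_variables_to_constants(
        sess, sess.graph.as_graph_def(), ['MobilenetV2/Predictions/Reshape_1'])

    with tf.gfile.GFile('D:/model/frozen_mobilenet_v2.pb', 'wb') as f:
        f.write(frozen.SerializeToString())

The frozen .pb can then be passed to the model optimizer on its own:
python mo_tf.py --input_model D:/model/frozen_mobilenet_v2.pb -b 1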

NotImplementedError: Operation of type AssignVariableOp (AssignVariableOp) is not supported on the TPU for inference

I'm trying to export a SavedModel from a classifier of type TPUEstimator. Since I'm trying to export the model to run predictions on a GPU/CPU, the use_tpu parameter of TPUEstimator was set to False.
When I try to save the model, the following error is thrown:
NotImplementedError: Operation of type AssignVariableOp (AssignVariableOp) is not supported on the TPU for inference. Execution will fail if this op is used in the graph. Make sure your variables are using variable_scope.
Since I plan to serve the model through a GPU/CPU, the Op shouldn't be a problem. How can I export this estimator as SavedModel?
This might help: just before calling export_savedmodel(...), set
estimator._export_to_tpu = False
If you actually don't need TPU inference support, you can create a tf.estimator.Estimator instead of a tf.contrib.tpu.TPUEstimator, using the same model_fn and trained model. Then you should be able to export the model.
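A minimal sketch of both suggestions in TF 1.x terms; my_model_fn, serving_input_fn and the directory names are placeholders for the asker's own code and paths:

import tensorflow as tf

# Option 1: keep the existing TPUEstimator but skip the TPU export path
classifier._export_to_tpu = False                  # `classifier` is the existing TPUEstimator
classifier.export_savedmodel('export_dir', serving_input_fn)

# Option 2: rebuild a plain Estimator with the same model_fn and checkpoint directory
estimator = tf.estimator.Estimator(model_fn=my_model_fn, model_dir='model_dir')
estimator.export_savedmodel('export_dir', serving_input_fn)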

Error with 8-bit Quantization in Tensorflow

I have been experimenting with the new 8-bit quantization feature available in TensorFlow. I could run the example given in the blog post (quantization of GoogLeNet) without any issue and it works fine for me!
Now I would like to apply the same to a simpler network. So I used a pre-trained network for CIFAR-10 (trained in Caffe), extracted its parameters, created the corresponding graph in TensorFlow, initialized the weights with the pre-trained values, and finally saved it as a GraphDef object. See this IPython Notebook for the full procedure.
Now I applied the 8-bit quantization with the TensorFlow script as described in Pete Warden's blog:
bazel-bin/tensorflow/contrib/quantization/tools/quantize_graph --input=cifar.pb --output=qcifar.pb --mode=eightbit --bitdepth=8 --output_node_names="ArgMax"
Now I wanted to run classification on this quantized network. So I loaded the new qcifar.pb into a TensorFlow session and passed the image in (the same way I passed it to the original version). Full code can be found in this IPython Notebook.
But as you can see at the end, I am getting the following error:
NotFoundError: Op type not registered 'QuantizeV2'
Can anybody suggest what I am missing here?
Because the quantized ops and kernels are in contrib, you'll need to explicitly load them in your Python script. There's an example of that in the quantize_graph.py script itself:
from tensorflow.contrib.quantization import load_quantized_ops_so
from tensorflow.contrib.quantization.kernels import load_quantized_kernels_so
This is something that we should update the documentation to mention!
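A rough sketch of what loading the quantized GraphDef looks like once those modules are imported; the input tensor name input:0 is hypothetical and depends on how the original CIFAR graph was built (the output name ArgMax follows the quantize_graph command above), and depending on the TensorFlow version the imported modules may also expose an explicit load function to call, as quantize_graph.py shows:

import tensorflow as tf
from tensorflow.contrib.quantization import load_quantized_ops_so
from tensorflow.contrib.quantization.kernels import load_quantized_kernels_so

# Read the quantized GraphDef produced by quantize_graph
graph_def = tf.GraphDef()
with tf.gfile.GFile('qcifar.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

with tf.Session() as sess:
    tf.import_graph_def(graph_def, name='')
    output = sess.graph.get_tensor_by_name('ArgMax:0')
    # 'input:0' is a hypothetical placeholder name; `image` stands for the
    # preprocessed input batch from the notebook
    prediction = sess.run(output, feed_dict={'input:0': image})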