Fused activation functions of conv ops should be NONE but they are RELU6 - tensorflow

I am trying to convert a frozen mobilenet_v1 graph to tflite. I'm interested in the conversion process itself, not just the resulting quantized tflite model, so I'm comparing my converted model against one already provided by tensorflow. The difference between them is that my conv2d ops have RELU6 as their fused activation function, while the correct model's conv2d ops have NONE. How should I go about fixing this problem?
I should also mention that I'm converting to a fully quantized tflite model by using tensorflow's quantization-aware training and activating the quantization flags upon conversion to tflite.
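For reference, my conversion step looks roughly like this (a minimal TF 1.x sketch; the file names, array names, and input stats are placeholders for my actual setup):

    import tensorflow as tf  # TF 1.x

    converter = tf.lite.TFLiteConverter.from_frozen_graph(
        graph_def_file="mobilenet_v1_frozen.pb",
        input_arrays=["input"],
        output_arrays=["MobilenetV1/Predictions/Reshape_1"])

    # Quantization flags for a quantization-aware-trained graph
    converter.inference_type = tf.uint8
    converter.quantized_input_stats = {"input": (127.5, 127.5)}  # (mean, std) placeholders

    tflite_model = converter.convert()
    with open("mobilenet_v1_quant.tflite", "wb") as f:
        f.write(tflite_model)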

Related

Partially restore weights in TF2

I trained a resnet model whose final layer has 3 outputs (multiclass classification). I want to use these model weights to pretrain a regression model which has the exact same architecture except the last layer, which has 1 output.
This seems like a very basic use case, but I do not see how to do it. Restoring a checkpoint gives an error since the architectures are not the same (mismatched shapes). All other solutions I have found are either for TF1 (e.g. https://innerpeace-wu.github.io/2017/12/13/Tensorflow-Restore-partial-weights/) or rely on Keras .h5 restore.
How can I do this in TF2?
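The kind of solution I am imagining is something like the following (a minimal sketch; build_resnet is a hypothetical builder that creates the same architecture with a configurable head): restore the full checkpoint into the original model, then copy weights layer by layer wherever the shapes match.

    import tensorflow as tf

    cls_model = build_resnet(num_outputs=3)   # trained multiclass model
    reg_model = build_resnet(num_outputs=1)   # new regression model

    # Restore the trained weights into the matching architecture first
    cls_model.load_weights("cls_checkpoint")

    # Copy weights layer by layer, skipping the mismatched head
    for src, dst in zip(cls_model.layers, reg_model.layers):
        src_weights = src.get_weights()
        dst_weights = dst.get_weights()
        if len(src_weights) == len(dst_weights) and \
                all(a.shape == b.shape for a, b in zip(src_weights, dst_weights)):
            dst.set_weights(src_weights)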

Cannot load tflite model, Did not get operators or tensors in subgraph 1

I have converted a tf model to tflite, applying quantization in the process, but I cannot load it. The error is raised when I call interpreter = tf.lite.Interpreter(tflite_model_path); the error message is:
ValueError: Did not get operators or tensors in subgraph 1.
Also during quantization, I got lots of these INFO messages for every dense layer in my model:
2021-09-06 04:38:40.879693: I tensorflow/lite/tools/optimize/quantize_weights.cc:217] Skipping quantization of tensor bert_token_clssfification/classifier/Tensordot/Shape that is not type float.
These messages confuse me greatly, because I'm sure those weights are of type float32. Any ideas what I'm doing wrong? Thanks!
I figured out the cause in my case. It was because I have dropout layers in my model, and I was using an input tf.bool tensor to explicitly control the training/inference mode of the dropout layers. Dropout is not currently supported in tflite, and because I was explicitly controlling the dropout behaviour, the tflite conversion could not remove the dropout operations.
The correct way to use dropout is to pass the training kwarg during model invocation: out = model(input_batch, training=True).
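A minimal sketch of the difference (sizes are arbitrary):

    import tensorflow as tf

    inputs = tf.keras.Input(shape=(128,))
    x = tf.keras.layers.Dense(64, activation="relu")(inputs)
    x = tf.keras.layers.Dropout(0.1)(x)       # no explicit tf.bool control tensor
    outputs = tf.keras.layers.Dense(2)(x)
    model = tf.keras.Model(inputs, outputs)

    input_batch = tf.random.normal([8, 128])
    out = model(input_batch, training=True)   # dropout active
    out = model(input_batch, training=False)  # dropout is an identity op

With training=False the Dropout layers reduce to identity ops, which the tflite converter can then fold away.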

Batch Normalization Quantize Tensorflow 1.x does not have MinMax information

A layer (....) which is an input to the Conv operator producing the output array model/re_lu_1/Relu, is lacking min/max data, which is necessary for quantization. If accuracy matters, either target a non-quantized output format, or run quantized training with your model from a floating point checkpoint to change the input graph to contain min/max information. If you don't care about accuracy, you can pass --default_ranges_min= and --default_ranges_max= for easy experimentation.
For TensorFlow 1.x, if you want to quantize, you have to insert fake quantization nodes into your graph to activate quantization of the model.
There are 3 phases of quantization (a sketch follows after this list):
Training: load your model into a graph => create the training graph via tf.contrib.quantize => train and save a weights checkpoint
Eval: load your model into a graph without weights => create the eval graph => restore the checkpoint => export a frozen model
Conversion: use toco/tflite to convert the frozen model into a quantized model
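A condensed TF 1.x sketch of the three phases (build_model is a hypothetical builder; paths are placeholders):

    import tensorflow as tf  # TF 1.x

    # Phase 1: training graph with fake-quant nodes
    train_graph = tf.Graph()
    with train_graph.as_default():
        logits = build_model(is_training=True)
        tf.contrib.quantize.create_training_graph(input_graph=train_graph,
                                                  quant_delay=0)
        # ... add loss/optimizer here, train, and save "model.ckpt"

    # Phase 2: eval graph, restore weights, export a frozen model
    eval_graph = tf.Graph()
    with eval_graph.as_default():
        logits = build_model(is_training=False)
        tf.contrib.quantize.create_eval_graph(input_graph=eval_graph)
        saver = tf.train.Saver()
        with tf.Session(graph=eval_graph) as sess:
            saver.restore(sess, "model.ckpt")
            frozen = tf.graph_util.convert_variables_to_constants(
                sess, eval_graph.as_graph_def(), [logits.op.name])
    with open("frozen_model.pb", "wb") as f:
        f.write(frozen.SerializeToString())

    # Phase 3: convert frozen_model.pb with toco/tflite, passing the
    # quantization flags (inference_type etc.) to get the quantized model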
However, the most important factor is the configuration of batch_normalization in the model. After trying multiple configurations, the best one is tensorflow.keras.layers.BatchNormalization with the fused option disabled.
The reason is that TensorFlow wants to avoid the folding result being quantized, so an activation placed behind the batchnorm won't work.
In short, the BatchNormalization layer should be attached only directly after a tensorflow.keras.layers.Conv2D whose activation param is set (relu/relu6/identity).
If you follow that ordering (Conv2D => activation => BatchNorm, sketched below), the layers will no longer raise the "does not have MinMax information" error.
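A minimal sketch of that ordering (layer sizes are arbitrary):

    import tensorflow as tf

    inputs = tf.keras.Input(shape=(224, 224, 3))
    # Conv2D carries the activation itself ...
    x = tf.keras.layers.Conv2D(32, 3, padding="same",
                               activation=tf.nn.relu6)(inputs)
    # ... and BatchNormalization follows, with the fused option disabled
    x = tf.keras.layers.BatchNormalization(fused=False)(x)
    model = tf.keras.Model(inputs, x)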

Deployment of keras layer UpSampling2D to tensorRT

Keras/TensorFlow layer UpSampling2D() cannot be deployed to TensorRT (known behavior).
I am trying to find a solution by replacing the layer UpSampling2D() with another Keras layer with equivalent behaviour.
Theoretically Conv2DTranspose() should do the job, by setting specific weights and fixing the weights of the layer during training.
I am looking for some help on how to do that.
I did a test run by replacing all the UpSampling2D() layers with Conv2DTranspose() in my model and then converted it to UFF. (I only trained the model for 1 epoch to save time.)
The converter then complained about DataFormatVecPermute instead.
Converting conv2d_transpose_1/conv2d_transpose-0-VecPermuteNHWCToNCHW-LayoutOptimizer as custom op: DataFormatVecPermute
Warning: No conversion function registered for layer: DataFormatVecPermute yet.
And the parser in C++ couldn't parse this model successfully either.
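For the weight-fixing part of the question, one way to make Conv2DTranspose() reproduce UpSampling2D() exactly (a minimal sketch; nearest_upsample_2x is a hypothetical helper): a kernel_size=2, strides=2 transposed convolution with a fixed per-channel identity kernel copies each pixel into its own 2x2 block, which is exactly nearest-neighbour 2x upsampling, and trainable=False keeps those weights frozen during training.

    import numpy as np
    import tensorflow as tf

    def nearest_upsample_2x(channels):
        """Conv2DTranspose configured to behave like UpSampling2D(size=2)."""
        layer = tf.keras.layers.Conv2DTranspose(
            filters=channels, kernel_size=2, strides=2,
            padding="valid", use_bias=False, trainable=False)
        layer.build((None, None, None, channels))
        # Kernel shape is (kh, kw, out_channels, in_channels): copy each
        # input channel into every position of its own 2x2 output block.
        kernel = np.zeros((2, 2, channels, channels), dtype=np.float32)
        for c in range(channels):
            kernel[:, :, c, c] = 1.0
        layer.set_weights([kernel])
        return layer

    # Quick numerical check against UpSampling2D
    x = tf.random.normal([1, 8, 8, 16])
    up = tf.keras.layers.UpSampling2D(size=2)(x)
    tr = nearest_upsample_2x(16)(x)
    print(np.allclose(up.numpy(), tr.numpy()))  # True

Whether this also sidesteps the DataFormatVecPermute issue likely depends on the layout optimizer rather than on the layer itself.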

Float ops found in quantized TensorFlow MobileNet model

As you can see in the screenshot of a quantized MobileNet model implemented in TensorFlow, there are still some float operations. The quantization is done in TensorFlow via the graph_transform tools.
The red ellipse in the image has its description in the right-hand-side text box. The "depthwise" is a "DepthwiseConv2dNative" operation that expects "DT_FLOAT" inputs.
Although the lower Relu6 performs an 8-bit quantized operation, its result has to go through "(Relu6)", which is a "Dequantize" op, in order to produce "DT_FLOAT" inputs for the depthwise convolution.
Why are depthwise conv operations left out by the TF graph_transform tools? Thank you.
Unfortunately there isn't a quantized version of depthwise conv in standard TensorFlow, so it falls back to the float implementation with conversions before and after. For a full eight-bit implementation of MobileNet, you'll need to look at TensorFlow Lite, which you can learn more about here:
https://www.tensorflow.org/mobile/tflite/
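For reference, with the current TFLite converter a fully eight-bit model can be produced roughly like this (a TF 2.x sketch; the random representative dataset is only for illustration):

    import tensorflow as tf

    model = tf.keras.applications.MobileNet(weights=None)  # stand-in model

    def representative_dataset():
        for _ in range(100):
            yield [tf.random.normal([1, 224, 224, 3])]

    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset
    # Restrict the converter to int8 kernels, including depthwise conv
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    tflite_model = converter.convert()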