How to understand the output key of tflite file - tensorflow

I am learning from and running the TensorFlow Pi Camera example, but I don't understand the output keys of the tflite file, such as the "quantization" key used at https://github.com/tensorflow/examples/blob/master/lite/examples/image_classification/raspberry_pi/classify_picamera.py#L52 . Is there any documentation?

You can look at this article about quantization.
The fundamental idea behind quantization is that if we convert the weights and inputs into integer types, we consume less memory and, on certain hardware, the calculations are faster.
I'm not sure exactly what you are asking about the output file, but you can process your tflite file with this git repo to understand what input your model needs and what its output will be.

You could refer to the description for post-training quantization https://www.tensorflow.org/lite/performance/post_training_quantization#representation_for_quantized_tensors and quantization specification https://www.tensorflow.org/lite/performance/quantization_spec.
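For what it's worth, the "quantization" entry that the Pi Camera example reads on that line is part of the tensor metadata exposed by tf.lite.Interpreter. Below is a minimal sketch (the model file name is a placeholder) showing where the key comes from and how a raw uint8 output is mapped back to float scores, much like the example does:

import numpy as np
import tensorflow as tf

# Load any quantized classification model; the file name is a placeholder.
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()[0]
output_details = interpreter.get_output_details()[0]
print(output_details["dtype"])         # e.g. numpy.uint8 for a quantized output
print(output_details["quantization"])  # (scale, zero_point); (0.0, 0) means not quantized

# Run the model on a dummy input just to have an output tensor to dequantize.
dummy = np.zeros(input_details["shape"], dtype=input_details["dtype"])
interpreter.set_tensor(input_details["index"], dummy)
interpreter.invoke()

# Dequantize: real_value = scale * (quantized_value - zero_point)
scale, zero_point = output_details["quantization"]
raw = interpreter.get_tensor(output_details["index"])
scores = scale * (raw.astype(np.float32) - zero_point)

The same keys appear in get_input_details(), so you can check the input tensor's dtype and quantization parameters the same way.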

Related

Tensorflow's quantization for RNN and LSTM

In the guide for Quantization Aware Training, I noticed that RNN and LSTM were listed in the roadmap for "future support". Does anyone know if it is supported now?
Is Post-Training Quantization also an option for quantizing RNNs and LSTMs? I don't see much information or discussion about it, so I wonder whether it is possible now or still in development.
Thank you.
I am currently trying to implement an 8-bit integer speech enhancement model based on DTLN (https://github.com/breizhn/DTLN). However, when I run inference with the quantized model on no audio (an empty array), it adds a strange waveform on top of the result: a constant tone every 125 Hz. I have checked the rest of the code and there is no problem there; it boils down to the quantization of the RNN/LSTM.
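As a point of reference, the standard post-training dynamic-range path can at least be attempted on a recurrent Keras model; whether the LSTM kernels actually end up quantized, and how well the result behaves, depends on the TensorFlow version, which seems to be the crux of the question above. A rough sketch with a toy model (the layer sizes are made up, not the real DTLN ones):

import tensorflow as tf

# Toy recurrent model standing in for a DTLN-style speech model (shapes are arbitrary).
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(128, return_sequences=True, input_shape=(100, 257)),
    tf.keras.layers.Dense(257, activation="sigmoid"),
])

# Dynamic-range (weight-only) post-training quantization.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
# Allowing TF ops as a fallback may be needed for some recurrent layers.
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,
    tf.lite.OpsSet.SELECT_TF_OPS,
]
tflite_model = converter.convert()
with open("lstm_dynamic_range.tflite", "wb") as f:
    f.write(tflite_model)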

Is it possible to convert a tflite model back to a TF model?

I converted a float TF model into an integer TFLite model so that I could run inference on an edge device. TFLite is lightweight and simple to deploy, but it has a few functions and input allocations that differ from TF, so I would like to revert to the TF model. If anyone has any insight into this matter, please leave your thoughts in the comments.
Thanks.
Because some information is lost during the conversion (e.g. due to several optimization steps), there's no defined way to convert it back. If you want to revert the flatbuffer (.tflite) to a frozen graph (.pb), you can refer to Converting .tflite to .pb.
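Since there is no supported reverse converter, a common starting point is to read the op/tensor layout out of the flatbuffer and rebuild the graph by hand in regular TF. A small sketch, assuming a local model.tflite:

import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()

# List every tensor (name, shape, dtype, quantization parameters) to use as a
# blueprint when re-implementing the graph in TensorFlow/Keras.
for t in interpreter.get_tensor_details():
    print(t["index"], t["name"], t["shape"], t["dtype"], t["quantization"])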

Quantization of Bert Classifier Model

I am currently trying to quantize a BERT classifier model but am running into an error; I was wondering whether this is even supported at the moment. To be clear, I am asking whether quantization is supported on the BERT Classifier super class in the tensorflow-model-garden. Thanks in advance for the help!
Quantizing the standard BERT classifier is probably not a good way to go if you are interested in running a BERT-like model on a resource-constrained edge device (like a mobile phone). For your specific question, I believe the answer is 'no, quantization of the standard BERT is not supported.' However, a better answer is probably to use one of the smaller BERT-type models that have been created for the edge use case, such as MobileBERT:
https://github.com/google-research/google-research/tree/master/mobilebert
The above link includes scripts for fine-tuning and then converting to TF Lite format in order to run on device.
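The MobileBERT repository ships its own conversion scripts, but for orientation, the generic SavedModel-to-TF-Lite path with weight-only quantization looks roughly like this (the directory name is a placeholder for a fine-tuned export):

import tensorflow as tf

# Placeholder path to a fine-tuned MobileBERT (or other classifier) SavedModel.
converter = tf.lite.TFLiteConverter.from_saved_model("mobilebert_savedmodel")
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # dynamic-range quantization
tflite_model = converter.convert()
with open("mobilebert_quant.tflite", "wb") as f:
    f.write(tflite_model)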

"Model not quantized" even after post-training quantization

I downloaded a TensorFlow model from Custom Vision and want to run it on a Coral TPU. I therefore converted it to TensorFlow Lite and applied hybrid post-training quantization (as far as I know that's the only way, because I do not have access to the training data).
You can see the code here: https://colab.research.google.com/drive/1uc2-Yb9Ths6lEPw6ngRpfdLAgBHMxICk
When I then try to compile it for the edge tpu, I get the following:
Edge TPU Compiler version 2.0.258810407
INFO: Initialized TensorFlow Lite runtime.
Invalid model: model.tflite
Model not quantized
Any idea what my problem might be?
tflite models are not fully quantized by converter.optimizations = [tf.lite.Optimize.OPTIMIZE_FOR_SIZE]. Have a look at post-training full integer quantization using a representative dataset: https://www.tensorflow.org/lite/performance/post_training_quantization#full_integer_quantization_of_weights_and_activations Simply adapt your generator function to yield representative samples (e.g. images similar to what your image classification network should predict). Very few images are enough for the converter to identify min and max values and quantize your model. However, accuracy is typically somewhat lower than with quantization-aware training.
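For reference, full integer post-training quantization looks roughly like the sketch below; the SavedModel path, the input shape, and the random calibration data are placeholders to be replaced with your Custom Vision export and a handful of real images:

import numpy as np
import tensorflow as tf

def representative_dataset():
    # Yield a few samples that look like real inputs; a handful of images is
    # enough for the converter to calibrate min/max ranges.
    for _ in range(100):
        yield [np.random.rand(1, 224, 224, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")  # placeholder path
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Force full-integer quantization so the Edge TPU compiler accepts the model.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
tflite_model = converter.convert()
with open("model_int8.tflite", "wb") as f:
    f.write(tflite_model)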
I can't find the source, but I believe the Edge TPU currently only supports 8-bit quantized models and no hybrid operators.
EDIT: Coral's FAQ mentions that the model needs to be fully quantized:
You need to convert your model to TensorFlow Lite and it must be
quantized using either quantization-aware training (recommended) or
full integer post-training quantization.

How to use SqueezeNet in CNTK?

I am a CNTK user. I use AlexNet but would like a more compact NN -- so SqueezeNet seems to be of interest. Or does someone have some other suggestion? How do CNTK users deploy when size matters? Does somebody have a CNTK implementation of SqueezeNet?
The new ONNX model format now has several pretrained vision models, including one for SqueezeNet. You can download the model and load it into CNTK:
import cntk as C
z = C.Function.load("your_model.onnx", format=C.ModelFormat.ONNX)  # path to the downloaded ONNX model
You can find tutorials for importing/exporting ONNX models in CNTK here.
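As a quick sanity check after the import, you can run the loaded network on a dummy image and save it back in CNTK's native format; the file names and the 3x224x224 input shape below are assumptions for a typical SqueezeNet:

import numpy as np
import cntk as C

# Reload the imported model; the file name is a placeholder.
z = C.Function.load("squeezenet.onnx", format=C.ModelFormat.ONNX)

# A typical SqueezeNet expects a 3x224x224 image; here a batch of one random image.
dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)
scores = z.eval({z.arguments[0]: dummy})
print(scores)

# Save in CNTK's native binary format for later reuse or fine-tuning.
z.save("squeezenet.cntkmodel")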
SqueezeNet is a good choice for a small network with the potential for good accuracy. Take a look at DSDSqueezeNet for even better accuracy.
However, if it does not need to be as small as SqueezeNet, you could also take a look at MobileNet or NASNet Mobile. These networks may be bigger, but they provide state-of-the-art performance in image classification.
Unfortunately, I do not have a CNTK implementation of SqueezeNet, but maybe a pretrained CNTK model that you can reuse and fine-tune with transfer learning is all you are looking for. In that case I can recommend MMdnn, a conversion tool that lets you convert an existing pretrained Caffe network to the CNTK model format. In this issue you can find a step-by-step guide for SqueezeNet.
I don't know of a method specifically for small deployments, but you basically have two choices when it comes to saving your model: the standard CNTK model format and the new ONNX format, which CNTK supports (or is about to). I have not been able to try it myself yet, but maybe it yields a smaller size for the same network.
Since the CNTK model format already saves the model in binary, I would not expect big improvements from any format. Still, compressing the model could definitely be an option if size is very important.