TensorFlow Lite Full-Integer Quantization fails in TF 2 - tensorflow2.0

I've trained ResNet50V2 and DenseNet169 models with TensorFlow nightly 2.3.0-dev20200608. The models work fine, and I tried several optimizations such as "plain" TF Lite conversion, TF Lite dynamic-range quantization, and TF Lite float16 quantization; they all work fine (the accuracy is either identical to the original or slightly lower, as expected).
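For context, the variants that did work were along these lines (a minimal sketch; the path is a placeholder matching the code below):

import tensorflow as tf

saved_model_dir = '/path/to/my/saved_models'

# Dynamic-range quantization (weights only).
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
dynamic_range_model = converter.convert()

# Float16 quantization.
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
float16_model = converter.convert()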
I want to convert my model to use full-integer post-training quantization with uint8. I converted my model from SavedModel format with:
import numpy as np
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model('/path/to/my/saved_models')
converter.optimizations = [tf.lite.Optimize.DEFAULT]

def representative_dataset_gen():
    for i in range(100):
        yield [x_train[i].astype(np.float32)]

converter.representative_dataset = representative_dataset_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
tflite_model = converter.convert()

with open('resnet.tflite', 'wb') as f:
    f.write(tflite_model)
as described on the TensorFlow Lite website. I then compiled the model for the Edge TPU. It works, in the sense that the Edge TPU lets me run it without errors, but the results are gibberish: it always predicts the same value. I then tried it on CPU with the TF Lite interpreter. The input/output tensors are correctly uint8, but again it always predicts the same value. On CPU with TF Lite the issue persists when moving to int8. Has anyone else experienced the same issue?
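For reference, this is roughly how I test the quantized model on CPU (a minimal sketch; one possible source of constant outputs would be feeding raw float images to a uint8 model without applying the input quantization parameters):

import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path='resnet.tflite')
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()[0]
output_details = interpreter.get_output_details()[0]

# Quantize the float image with the input tensor's scale and zero point.
scale, zero_point = input_details['quantization']
sample = x_train[0].astype(np.float32)
quantized = np.uint8(np.clip(np.round(sample / scale + zero_point), 0, 255))

interpreter.set_tensor(input_details['index'], np.expand_dims(quantized, 0))
interpreter.invoke()
raw_output = interpreter.get_tensor(output_details['index'])

# Dequantize the output before comparing it with the float model.
out_scale, out_zero_point = output_details['quantization']
probabilities = (raw_output.astype(np.float32) - out_zero_point) * out_scale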
Please find here a Google Drive folder with the code I use to convert, and the model before and after conversion:
https://drive.google.com/drive/folders/11XruNeJzdIm9DTn7FnuIWYaSalqg2F0B?usp=sharing

Related

Tensorflow 2 SSD MobileNet model breaks during conversion to tflite

I've been trying to follow this process to run an object detector (SSD MobileNet) on the Google Coral Edge TPU:
Edge TPU model workflow
I've successfully trained and evaluated my model with the Object Detection API. I have the model both in checkpoint format as well as tf SavedModel format. As per the documentation, the next step is to convert to .tflite format using post-training quantization.
I am attempting to follow this example. The export_tflite_graph_tf2.py script and the conversion code that comes after run without errors, but I see some weird behavior when I try to actually use the model to run inference.
I am unable to use the saved_model generated by export_tflite_graph_tf2.py. When running the following code, I get an error:
print('loading model...')
model = tf.saved_model.load(tflite_base)
print('model loaded!')
results = model(image_np)
TypeError: '_UserObject' object is not callable --> results = model(image_np)
As a result, I have no way to tell if the script broke my model or not before I even convert it to tflite. Why would model not be callable in this way? I have even verified that the type returned by tf.saved_model.load() is the same when I pass in a saved_model before it went through the export_tflite_graph_tf2.py script and after. The only possible explanation I can think of is that the script alters the object in some way that causes it to break.
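A possible workaround for loading it (a sketch, not verified for this particular export; the argument name is an assumption to be checked against the printed signature) is to go through the serving signature instead of calling the loaded object directly:

import tensorflow as tf

model = tf.saved_model.load(tflite_base)
infer = model.signatures['serving_default']   # exported SavedModels usually expose this
print(infer.structured_input_signature)       # shows the expected input name, shape and dtype
# Then call it by keyword, e.g. (the argument name below is a guess):
# results = infer(input_tensor=tf.convert_to_tensor(image_np, dtype=tf.float32))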
I convert to tflite with post-training quantization with the following code:
def representative_data_gen():
    dataset_list = tf.data.Dataset.list_files(images_dir + '/*')
    for i in range(100):
        image = next(iter(dataset_list))
        image = tf.io.read_file(image)
        # decode_image supports PNG as well
        image = tf.io.decode_image(image, channels=3)
        image = tf.image.resize(image, [IMAGE_SIZE, IMAGE_SIZE])
        image = tf.cast(image / 255., tf.float32)
        image = tf.expand_dims(image, 0)
        if i == 0:
            print(image.dtype)
        yield [image]

converter = tf.lite.TFLiteConverter.from_saved_model(base_saved_model)
# converter = tf.lite.TFLiteConverter.from_keras(model)
# This enables quantization
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # issue here?
# This sets the representative dataset for quantization
converter.representative_dataset = representative_data_gen
converter.target_spec.supported_ops = [
    # tf.lite.OpsSet.TFLITE_BUILTINS,  # enable TensorFlow Lite ops.
    # tf.lite.OpsSet.SELECT_TF_OPS,    # enable TensorFlow ops.
    # This ensures that if any ops can't be quantized, the converter throws an error
    tf.lite.OpsSet.TFLITE_BUILTINS_INT8
]
# For full integer quantization, though supported_types defaults to int8 only, we explicitly declare it for clarity.
converter.target_spec.supported_types = [tf.int8]
converter.target_spec.supported_ops += [tf.lite.OpsSet.TFLITE_BUILTINS]
# These set the input and output tensors to uint8 (added in r2.3)
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
tflite_model_quantized = converter.convert()
Everything runs with no errors, but when I try to actually run an image through the model, it returns garbage. I tried removing the quantization to see if that was the issue, but even without quantization it returns seemingly random bounding boxes that are completely off from the model's performance prior to conversion. The shapes of the output tensors look fine; it's just that the content is all wrong.
What's the right way to get this model converted to a quantized tflite form? I should note that I can't use the tflite_convert utility because I need to quantize the model, and it appears from the source code that the quantize_weights flag is deprecated. There are a bunch of conflicting resources from TF1 and TF2 about this conversion process, so I'm pretty confused.
Note: I'm using a retrained SSD MobileNet from the model zoo. I have not made any changes to the architecture in my training workflow. I've confirmed that the errors persist even on the base model pulled directly from the object detection model zoo.
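In case it helps with debugging, here is a minimal sketch of how I inspect the quantized model's inputs and outputs with the Python interpreter (the uint8 scaling step and the exact output layout of the SSD postprocess are assumptions; I mainly use it to check dtypes, shapes and quantization parameters):

import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_content=tflite_model_quantized)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()[0]

# With uint8 I/O, the image has to be quantized with the input scale and zero point.
scale, zero_point = input_details['quantization']
image = np.zeros(input_details['shape'][1:], dtype=np.float32)  # placeholder preprocessed image in [0, 1]
quantized = np.uint8(np.clip(np.round(image / scale + zero_point), 0, 255))
interpreter.set_tensor(input_details['index'], quantized[np.newaxis, ...])
interpreter.invoke()

# The SSD postprocess typically emits boxes, classes, scores and a detection count.
for detail in interpreter.get_output_details():
    print(detail['name'], detail['shape'], detail['dtype'], detail['quantization'])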
I have a very similar problem with post-training quantization and asked about it on GitHub.
I managed to get results from the TFLite model, but they were not good enough. Here is the notebook showing how I did it. Maybe it helps you get a step forward.

Converting a Keras model to TensorFlow lite - how to avoid unsupported operations?

I have a MobileNetV2-based model that uses the TimeDistributed layer. I want to convert that model to a TensorFlow Lite model in order to run it on a smartphone, but there is an unsupported operation.
Here is the code:
import tensorflow as tf

IMAGE_SHAPE = (224, 224, 3)

mobilenet_model = tf.keras.applications.MobileNetV2(input_shape=IMAGE_SHAPE,
                                                    include_top=False,
                                                    pooling='avg',
                                                    weights='imagenet')
inputs = tf.keras.Input(shape=(5,) + IMAGE_SHAPE)
x = tf.keras.applications.mobilenet_v2.preprocess_input(inputs)
outputs = tf.keras.layers.TimeDistributed(mobilenet_model)(x)
model = tf.keras.Model(inputs, outputs)
model.compile()

converter = tf.lite.TFLiteConverter.from_keras_model(model)
tfmodel = converter.convert()  # fails
Here is the error message:
error: failed while converting: 'main':
Some ops are not supported by the native TFLite runtime, you can enable TF kernels fallback using TF Select. See instructions: https://www.tensorflow.org/lite/guide/ops_select
TF Select ops: Mul
Details:
tf.Mul(tensor<?x5x224x224x3xf32>, tensor<f32>) -> (tensor<?x5x224x224x3xf32>)
The error is caused by the interaction between the input preprocessing and the TimeDistributed layer. If I disable the input preprocessing, the conversion succeeds, but obviously the network will not work properly without retraining. Also, models that have the preprocessing but don't have the TimeDistributed layer can be converted. Is it possible to move the preprocessing to a different place to avoid this error?
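One rearrangement I'm considering (an untested sketch) is to wrap the preprocessing and the backbone in a single per-frame model and apply TimeDistributed to that, so the scalar multiply only ever sees 4-D tensors:

import tensorflow as tf

IMAGE_SHAPE = (224, 224, 3)
mobilenet_model = tf.keras.applications.MobileNetV2(input_shape=IMAGE_SHAPE,
                                                    include_top=False,
                                                    pooling='avg',
                                                    weights='imagenet')

# Per-frame submodel: preprocessing + backbone on 4-D tensors.
frame_input = tf.keras.Input(shape=IMAGE_SHAPE)
frame_x = tf.keras.applications.mobilenet_v2.preprocess_input(frame_input)
frame_output = mobilenet_model(frame_x)
frame_model = tf.keras.Model(frame_input, frame_output)

# The 5-D sequence input only passes through TimeDistributed, not the preprocessing.
inputs = tf.keras.Input(shape=(5,) + IMAGE_SHAPE)
outputs = tf.keras.layers.TimeDistributed(frame_model)(inputs)
model = tf.keras.Model(inputs, outputs)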
Also, adding the select ops helps it convert successfully, but I'm not sure how to enable them at runtime. I'm using the Mediapipe framework to build an Android app, and I don't think Mediapipe supports linking in extra operations.
Try this for converting the model:

import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model(model_dir)
converter.experimental_new_converter = False
converter.optimizations = [tf.lite.Optimize.OPTIMIZE_FOR_SIZE]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS,
                                       tf.lite.OpsSet.SELECT_TF_OPS]
converter.allow_custom_ops = True
tflite_model = converter.convert()
Check the documentation on how to convert it and use it inside Android. Maybe you will have success with that. If not, come back and ping me again.
You can give it a try without:

converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,  # enable TensorFlow Lite ops.
    tf.lite.OpsSet.SELECT_TF_OPS     # enable TensorFlow ops.
]

if your TensorFlow version is 2.6.0 or 2.6.1.
Check the allowlisted operators here.
Best

EDGE_TPU COMPILER ERROR: Didn't find op for builtin opcode 'RESIZE_NEAREST_NEIGHBOR' version '3' for custom YOLO

I have retrained my model with Darknet and used https://github.com/qqwweee/keras-yolo3 to convert my Darknet weights to h5.
I have replaced leaky ReLU with plain ReLU for quantization purposes.
My model was then successfully converted to a TFLite model using tf-nightly.
However, I can't compile the model for the Edge TPU because of a resize-nearest-neighbor error.
To my understanding, resize nearest neighbor is supported: https://coral.ai/docs/edgetpu/models-intro/#supported-operations
So why did this error happen?
Is there any way to fix it?
Here is my TFLite conversion code:
import sys

import numpy as np
import tensorflow as tf

def representative_dataset_gen():
    for _ in range(250):
        yield [np.random.uniform(0.0, 1.0, size=(1, 416, 416, 3)).astype(np.float32)]

if len(sys.argv) != 3:
    print(f"Usage: {sys.argv[0]} <keras-model> <output-filename>")
    sys.exit(1)

model_fn = sys.argv[1]
out_fn = sys.argv[2]

# Convert and apply full integer quantization
converter = tf.compat.v1.lite.TFLiteConverter.from_keras_model_file(model_fn)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8,
                                       tf.lite.OpsSet.SELECT_TF_OPS]
# converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
# Set inputs and outputs of the network to 8-bit unsigned integer
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
converter.representative_dataset = representative_dataset_gen
tflite_model = converter.convert()

open(out_fn, "wb").write(tflite_model)
ERROR
I'm from the Coral team and I'm planning on investigating the yolov3 model myself; I just haven't had the bandwidth to do so yet :)
Here are some tips from what I've gathered:
Users have been able to successfully compile the model after changing leaky_relu to relu, although accuracy may decrease. I know you've mentioned this, but I wanted to put it on the list for other users to reference.
Secondly, I suspect that you are using tf-nightly or some newer TF version for your conversion? If so, I suggest downgrading to maybe tf2.2; some newer versions of the same ops are not yet supported by the compiler.
Also try turning off the MLIR converter; the released version of the edgetpu_compiler doesn't play well with MLIR.
Let me know if you find some success; I'd love to give this a shot also!
FYI: I got yolov4 converted, but the architecture only allows 1/962 ops to run on the Edge TPU, so it's unfortunately a no-go.

Unable to properly convert tf.keras model to quantized format for coral TPU

I'm trying to convert a tf.keras model based on MobileNetV2 with transpose convolution using the latest tf-nightly. Here is the conversion code:
# saved_model_dir = '/content/ksaved'  # tried from a saved model also
# converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter = tf.lite.TFLiteConverter.from_keras_model(reshape_model)
converter.experimental_new_converter = True
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
converter.representative_dataset = representative_dataset_gen
tflite_quant_model = converter.convert()
open("/content/model_quant.tflite", "wb").write(tflite_quant_model)
The conversion was successful (in Google Colab), but the model has quantize and dequantize operators at the ends (as seen using Netron). All operators seem to be supported. The representative dataset images are float32 in the generator, and the model has a 4-channel float32 input by default. It looks like we need uint8 input and output inside the model for the Coral TPU. How can we properly carry out this conversion?
Ref:-
Full integer quantization of weights and activations
How to quantize inputs and outputs of optimized tflite model
Coral Edge TPU Compiler cannot convert tflite model: Model not quantized
I tried 'tf.compat.v1.lite.TFLiteConverter.from_keras_model_file' instead of the v2 version. I got the error "Quantization not yet supported for op: TRANSPOSE_CONV" while trying to quantize the model in the latest tf1.15 (using a representative dataset), and "Internal compiler error. Aborting!" from the Coral TPU compiler using the tf2.0 quantized tflite.
Tflite model # https://github.com/tensorflow/tensorflow/issues/31368
It seems to work until the last convolutional block (1x7x7x160).
The compiler error (Aborting) does not give any information regarding the potential cause, and all types of convolutional layers seem to be supported as per the Coral docs.
Coral doc: https://coral.ai/docs/edgetpu/models-intro/#quantization
Here is a dummy model example of quantizing a Keras model. Notice I'm using strictly tf1.15 for the example, because tf2.0 deprecated:
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
with the from_keras_model API. I think the most confusing thing about this is that you can still call it, but nothing happens. This means that the model will still take in float inputs. I notice that you are using tf2.0, because from_keras_model is a tf2.0 API. Coral still suggests using tf1.15 for converting models for now. I suggest downgrading TensorFlow, or maybe even just using this (while keeping tf2.0; it may or may not work):
tf.compat.v1.lite.TFLiteConverter.from_keras_model_file
More on it here.
I always make sure not to use the experimental converter:
converter.experimental_new_converter = False
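The linked dummy example isn't reproduced above, so here is a rough reconstruction of the tf1.15-style flow it refers to (a sketch, not the original example; the model path, input shape and calibration data are placeholders):

import numpy as np
import tensorflow as tf  # assumes tf1.15

def representative_dataset_gen():
    # Placeholder calibration data; real preprocessed images should be used here.
    for _ in range(100):
        yield [np.random.rand(1, 224, 224, 3).astype(np.float32)]

converter = tf.compat.v1.lite.TFLiteConverter.from_keras_model_file('model.h5')
converter.experimental_new_converter = False           # as noted above
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8               # honored by the tf1 converter
converter.inference_output_type = tf.uint8
tflite_model = converter.convert()
open('model_quant.tflite', 'wb').write(tflite_model)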

Is there any way to convert a tensorflow lite (.tflite) file back to a keras file (.h5)?

I lost my dataset through a careless mistake. I only have my tflite file left. Is there any solution to convert it back to an h5 file? I have done decent research on this but found no solutions.
The conversion from a TensorFlow SavedModel or tf.keras H5 model to .tflite is an irreversible process. Specifically, the original model topology is optimized during compilation by the TFLite converter, which leads to some loss of information. Also, the original tf.keras model's loss and optimizer configurations are discarded, because those aren't required for inference.
However, the .tflite file still contains some information that can help you restore the original trained model. Most importantly, the weight values are available, although they might be quantized, which could lead to some loss in precision.
The code example below shows you how to read weight values from a .tflite file after it's created from a simple trained tf.keras.Model.
import numpy as np
import tensorflow as tf

# First, create and train a dummy model for demonstration purposes.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, input_shape=[5], activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid")])
model.compile(loss="binary_crossentropy", optimizer="sgd")
xs = np.ones([8, 5])
ys = np.zeros([8, 1])
model.fit(xs, ys, epochs=1)

# Convert it to a TFLite model file.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
open("converted.tflite", "wb").write(tflite_model)

# Use `tf.lite.Interpreter` to load the written .tflite back from the file system.
interpreter = tf.lite.Interpreter(model_path="converted.tflite")
all_tensor_details = interpreter.get_tensor_details()
interpreter.allocate_tensors()

for tensor_item in all_tensor_details:
    print("Weight %s:" % tensor_item["name"])
    print(interpreter.tensor(tensor_item["index"])())
These weight values loaded back from the .tflite file can be used with tf.keras.Model.set_weights() method, which will allow you to re-inject the weight values into a new instance of trainable Model that you have in Python. Obviously, this requires you to still have access to the code that defines the model's architecture.
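As a rough sketch of that last step, continuing from the snippet above (hedged: tensor names depend on the converter version, and FULLY_CONNECTED kernels are typically stored transposed relative to the Keras layout, so some manual inspection is needed):

# Rebuild the same architecture, then re-inject the weights read from the .tflite file.
new_model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, input_shape=[5], activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid")])

# Collect the tensors that actually hold data (weights); intermediate activation
# tensors may not be materialized and can raise ValueError.
recovered = {}
for tensor_item in all_tensor_details:
    try:
        recovered[tensor_item["name"]] = interpreter.tensor(tensor_item["index"])()
    except ValueError:
        pass

# Inspect recovered.keys() to match tensors to layers; the names below are
# illustrative only. TFLite stores FULLY_CONNECTED kernels as [units, inputs],
# so transpose them back to the Keras [inputs, units] layout:
# kernel = recovered["sequential/dense/MatMul"].T
# bias = recovered["sequential/dense/BiasAdd/ReadVariableOp"]
# new_model.layers[0].set_weights([kernel, bias])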