Cannot load tflite model, Did not get operators or tensors in subgraph 1 - tensorflow

I have converted a tf model to tflite, and applied quantization in the process, but I cannot load it. The error was raised when I try to do interpreter = tf.lite.Interpreter(tflite_model_path), the error message was:
ValueError: Did not get operators or tensors in subgraph 1.
Also during quantization, I got lots of these INFO messages for every dense layer in my model:
2021-09-06 04:38:40.879693: I tensorflow/lite/tools/optimize/quantize_weights.cc:217] Skipping quantization of tensor bert_token_clssfification/classifier/Tensordot/Shape that is not type float.
These messages confuse me greatly, because I'm sure those weights are of type float32. Any ideas what I'm doing wrong? Thanks!

I figured out the cause in my case. It was because I have dropout layers in my model, and I'm using an input tf.bool tensor to explicitly control the training/inference mode of the dropouts layers. Dropout is not currently supported in tflite, and because I'm explicitly controlling the dropout behaviour, the tflite conversion cannot remove the dropout operations.
The correct way to use dropout is to pass the training kwarg during model invocation: out = model(input_batch, training=True).

Related

ValueError: Unknown layer: AnchorBoxes quantization tensorflow

I am applying quantization to a SSD model. The gist is attached. There is a custom object called "AnchorBoxes" which is added while loading the model. This works fine when I don't do quantization. But when I apply quantization, this custom object is not recognized.
I tried a work around.
def apply_quantization_to_conv2D(layer):
#print(layer)
if isinstance(layer, tf.keras.layers.Conv2D):
return tfmot.quantization.keras.quantize_annotate_layer(layer)
return layer
# Use `tf.keras.models.clone_model` to apply `apply_quantization_to_dense`
# to the layers of the model.
annotated_model = tf.keras.models.clone_model(
model,
clone_function=apply_quantization_to_conv2D,
)
#annotated_model.save('quantize_ready_model_20_01_Conv2D.h5', include_optimizer=True)
annotated_model.summary()
# Now that the Dense layers are annotated,
# `quantize_apply` actually makes the model quantization aware.
#quant_aware_model = tfmot.quantization.keras.quantize_apply(annotated_model)
I commented this line quant_aware_model = tfmot.quantization.keras.quantize_apply(annotated_model) in the above code as it was throwing the error ValueError: Unknown layer: AnchorBoxes
Instead I saved the model after applying quantization to the Conv2D layers as below
def apply_quantization_to_conv2D(layer):
#print(layer)
if isinstance(layer, tf.keras.layers.Conv2D):
return tfmot.quantization.keras.quantize_annotate_layer(layer)
return layer
# Use `tf.keras.models.clone_model` to apply `apply_quantization_to_dense`
# to the layers of the model.
annotated_model = tf.keras.models.clone_model(
model,
clone_function=apply_quantization_to_conv2D,
)
annotated_model.summary()
annotated_model.save('quantize_ready_model_20_01_Conv2D_1.h5', include_optimizer=True)
# Now that the Dense layers are annotated,
# `quantize_apply` actually makes the model quantization aware.
#quant_aware_model = tfmot.quantization.keras.quantize_apply(annotated_model)
#quant_aware_model.compile(optimizer=adam, loss=ssd_loss.compute_loss)
#quant_aware_model.summary()
Then I loaded the model hoping that the loaded quantized model as below will have the custom_objects attached to it.
with tfmot.quantization.keras.quantize_scope():
loaded_model = tf.keras.models.load_model('./quantize_ready_model_20_01_Conv2D_1.h5', custom_objects={'AnchorBoxes': AnchorBoxes})
Finally I applied the quantize_apply to the new loaded_model which has quantized layers.
quant_aware_model = tfmot.quantization.keras.quantize_apply(loaded_model)
which again resulted in the same error
ValueError: Unknown layer: AnchorBoxes
System information
TensorFlow version (installed from source or binary): TF 2.0.0
TensorFlow Model Optimization version (installed from source or binary): 0.5.0
Describe the expected behavior
When I run quantize_apply(model), the model should become quantization aware
Describe the current behavior
Throwing an error on the custom objects
Code to reproduce the issue
gist
The issue was fixed after passing the custom layer like this AnchorBoxes': AnchorBoxes in the below code.
with quantize_scope(
{'DefaultDenseQuantizeConfig': DefaultDenseQuantizeConfig,
'AnchorBoxes': AnchorBoxes}):
# Use `quantize_apply` to actually make the model quantization aware.
quant_aware_model = tfmot.quantization.keras.quantize_apply(annotated_model)

Fused activation functions of conv ops should be NONE but they are RELU6

I am trying to convert a frozen mobilenet_v1 graph to tflite. I'm interested in the conversion process and not just the quantized tflite model, so I'm comparing my converted model to one already made by tensorflow. The difference between them is that my conv2d nodes have a relu6 activation function while the correct model's conv2d nodes don't have any activation function. How should I go about fixing this problem?
I should also mention that I'm converting to a fully quantized tflite model by using tensorflow's quantization-aware training and activating the quantization flags upon conversion to tflite.

What does `training=True` mean when calling a TensorFlow Keras model?

In TensorFlow's offcial documentations, they always pass training=True when calling a Keras model in a training loop, for example, logits = mnist_model(images, training=True).
I tried help(tf.keras.Model.call) and it shows that
Help on function call in module tensorflow.python.keras.engine.network:
call(self, inputs, training=None, mask=None)
Calls the model on new inputs.
In this case `call` just reapplies
all ops in the graph to the new inputs
(e.g. build a new computational graph from the provided inputs).
Arguments:
inputs: A tensor or list of tensors.
training: Boolean or boolean scalar tensor, indicating whether to run
the `Network` in training mode or inference mode.
mask: A mask or list of masks. A mask can be
either a tensor or None (no mask).
Returns:
A tensor if there is a single output, or
a list of tensors if there are more than one outputs.
It says that training is a Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode. But I didn't find any information about this two modes.
In a nutshell, I don't know what is the influence of this argument. And what if I missed this argument when training?
Some neural network layers behave differently during training and inference, for example Dropout and BatchNormalization layers. For example
During training, dropout will randomly drop out units and correspondingly scale up activations of the remaining units.
During inference, it does nothing (since you usually don't want the randomness of dropping out units here).
The training argument lets the layer know which of the two "paths" it should take. If you set this incorrectly, your network might not behave as expected.
Training indicating whether the layer should behave in training mode or in inference mode.
training=True: The layer will normalize its inputs using the mean and variance of the current batch of inputs.
training=False: The layer will normalize its inputs using the mean and variance of its moving statistics, learned during training.
Usually in inference mode training=False, but in some networks such as pix2pix_cGAN‍‍‍‍‍‍ At both times of inference and training, training=True.

Deployment of keras layer UpSampling2D to tensorRT

Kears/TensorFlow layer UpSampling2D() cannot be deployed to TensorRT (known behavior).
I am trying to find a solution by replacing the layer UpSampling2D() by other Keras layer with parallel behaviour.
Theoretically Conv2DTranspose() should do the work, by setting specific weights and fixing the weights of the layers in training.
I am looking for some help on how to do that.
I did a test run by replacing all the UpSampling 2D() with Conv2DTranspose() in my model and then converted it to UFF. (I only trained the model for 1 epoch to save time).
The converter then complained about DataFormatVecPermute instead.
Converting conv2d_transpose_1/conv2d_transpose-0-VecPermuteNHWCToNCHW-LayoutOptimizer as custom op: DataFormatVecPermute
Warning: No conversion function registered for layer: DataFormatVecPermute yet.
And the parser in C++ couldn't parse this model successfully either.

Mask R-CNN for TPU on Google Colab

We are trying to build an image segmentation deep learning model using Google Colab TPU. Our model is Mask R-CNN.
TPU_WORKER = 'grpc://' + os.environ['COLAB_TPU_ADDR']
import tensorflow as tf
tpu_model = tf.contrib.tpu.keras_to_tpu_model(
model.keras_model,
strategy=tf.contrib.tpu.TPUDistributionStrategy(
tf.contrib.cluster_resolver.TPUClusterResolver(TPU_WORKER)))
However I am running into issues while converting our Mask R-CNN model to TPU model as pasted below.
ValueError:
Layer <keras.engine.topology.InputLayer object at 0x7f58574f1940> has a
variable shape in a non-batch dimension. TPU models must
have constant shapes for all operations.
You may have to specify `input_length` for RNN/TimeDistributed layers.
Layer: <keras.engine.topology.InputLayer object at 0x7f58574f1940>
Input shape: (None, None, None, 3)
Output shape: (None, None, None, 3)
Appreciate any help.
Google recently released a tutorial on getting Mask R-CNN going on their TPUs. For this, they are using an experimental model for Mask RCNN on Google's TPU github repository (under models/experimental/mask_rcnn). Looking through the code, it looks like they define the model with a fixed input size to overcome the issue you are seeing.
See below for more explanation:
As #aman2930 points out, the shape of your input tensor is not static. This won't work because Tensorflow compiles models with XLA to use a TPU and XLA must have all tensor shapes defined at compile time. In the link above, the website specifically calls this out:
Static shapes
During regular usage TensorFlow attempts to determine the shapes of each tf.Tensor during graph construction. During
execution any unknown shape dimensions are determined dynamically, see
Tensor Shapes for more details.
To run on Cloud TPUs TensorFlow models are compiled using XLA. XLA
uses a similar system for determining shapes at compile time. XLA
requires that all tensor dimensions be statically defined at compile
time. All shapes must evaluate to a constant, and not depend on
external data, or stateful operations like variables or a random
number generator.
That side, further down the document, they mention that the input function is run on the CPU, so isn't limited by static XLA sizes. They point to batch size being the issue, not image size:
Static shapes and batch size
The input pipeline generated by your
input_fn is run on CPU. So it is mostly free from the strict static
shape requirements imposed by the XLA/TPU environment. The one
requirement is that the batches of data fed from your input pipeline
to the TPU have a static shape, as determined by the standard
TensorFlow shape inference algorithm. Intermediate tensors are free to
have a dynamic shapes. If shape inference has failed, but the shape is
known it is possible to impose the correct shape using tf.set_shape().
So you could fix this by reformulating your model to have fixed batch size or to use
tf.contrib.data.batch_and_drop_remainder as they suggest.
Could you please share the input data function. It is hard to tell the exact issue, but it seems that the shape of tensor representing input sample is not static.