Why am I getting this error? ValueError: Output tensor at index 0 is expected to have 3 dimensions, found 2 - object-detection

My project is based on object detection using TFLite on a Raspberry Pi 4. The code works perfectly without the Coral accelerator, but when I try to speed up detection using the Google Coral accelerator I get this error (I run the code using Google Colab):
ValueError: Output tensor at index 0 is expected to have 3 dimensions, found 2.
This is the full code, as a Colab notebook:
https://colab.research.google.com/github/khanhlvg/tflite_raspberry_pi/blob/main/object_detection/Train_custom_model_tutorial.ipynb
Please give me a detailed solution, thanks :)
I tried changing the TFLite version to 2.5.0 but the code didn't work.
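One way to narrow this down is to print the output tensor shapes of the Edge TPU-compiled model and compare them with the CPU model's, since the error means the detector found a 2-D tensor where a 3-D detection output was expected. A minimal diagnostic sketch, assuming the Coral runtime is installed (the model filename here is a placeholder):

import tflite_runtime.interpreter as tflite

# Load the Edge TPU-compiled model with the Coral delegate.
interpreter = tflite.Interpreter(
    model_path='model_edgetpu.tflite',  # placeholder filename
    experimental_delegates=[tflite.load_delegate('libedgetpu.so.1')])
interpreter.allocate_tensors()

# An ObjectDetector-compatible model reports four detection outputs,
# e.g. [1, N, 4] boxes, [1, N] classes, [1, N] scores, [1] count.
for detail in interpreter.get_output_details():
    print(detail['index'], detail['shape'], detail['name'])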

Related

How would I convert this TensorFlow image classification model to Core ML?

I’m learning TensorFlow and want to convert an image classification model to Core ML for use in an iOS app.
This TensorFlow image classification tutorial is a close match to what I want to do for the training, but I haven’t been able to figure out how to convert that to Core ML.
Here’s what I’ve tried, adding the following to the end of the Colab notebook for the tutorial:
# install coremltools
!pip install coremltools
# import coremltools
import coremltools as ct
# define the input type
image_input = ct.ImageType()
# create classifier configuration with the class labels
classifier_config = ct.ClassifierConfig(class_names)
# perform the conversion
coreml_model = ct.convert(
    model, inputs=[image_input], classifier_config=classifier_config,
)
# print info about the converted model
print(coreml_model)
# save the file
coreml_model.save('my_coreml_model')
That successfully creates an mlmodel file, but when I download the file and open it in Xcode to test it (under the “Preview” tab) it shows results like “Roses 900% Confidence” and “Tulips 1,120% Confidence”. For my uses, the confidence percentage needs to be from 0 to 100%, so I think I’m missing some parameter for the conversion.
On import coremltools as ct I do get some warnings like “WARNING:root:TensorFlow version 2.8.2 has not been tested with coremltools. You may run into unexpected errors.” but I’m guessing that’s not the problem since the conversion doesn’t report any errors.
Based on information here, I’ve also tried setting a scale on the image input:
image_input = ct.ImageType(scale=1/255.0)
… but that made things worse as it then has around 315% confidence that every image is a dandelion. A few other attempts at setting a scale / bias all resulted in the same thing.
At this point I’m not sure what else to try. Any help is appreciated!
The last layer of your model should be something like this:
layers.Dense(num_classes, activation='softmax')
The softmax function transforms the model's raw outputs (logits) into the probabilities you need.
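If you don't want to retrain, one option (a sketch, untested against your exact notebook) is to append a Softmax layer to the already-trained model before converting, reusing the model and class_names from your notebook:

import tensorflow as tf
import coremltools as ct

# Wrap the trained model so its outputs pass through softmax,
# turning logits into probabilities in [0, 1].
probability_model = tf.keras.Sequential([model, tf.keras.layers.Softmax()])

image_input = ct.ImageType()  # add scale/bias only if the model doesn't already rescale inputs
classifier_config = ct.ClassifierConfig(class_names)
coreml_model = ct.convert(
    probability_model, inputs=[image_input], classifier_config=classifier_config,
)
coreml_model.save('my_coreml_model')

Alternatively, retraining with activation='softmax' on the final Dense layer achieves the same thing.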

TF Yamnet Transfer Learning and Quantization

TLDR:
Short term: Trying to quantize a specific portion of a TF model (recreated from a TFLite model). Skip to the pictures below.
Long term: Transfer learn on Yamnet and compile for the Edge TPU.
Source code to follow along is here
I've been trying to transfer learn on Yamnet and compile for a Coral Edge TPU for a few weeks now.
Started here, but quickly realized that the model wouldn't quantize and compile for the Edge TPU because of its dynamic input, and out-of-the-box TFLite quantization doesn't work well with the audio preprocessing that runs before Yamnet's MobileNet.
After tinkering and learning for a few weeks, I found a Yamnet model compiled for the Edge TPU (sadly without source code) and figured my best shot would be to recreate it in TF, quantize it, convert it to TFLite, and then compile it for the Edge TPU. I'll also have to figure out how to set the weights; I'm not sure if I have to (or can) do that pre- or post-quantization. Anyway, I've effectively recreated the model, but I'm having a hard time quantizing it without a bunch of wacky behavior.
The model currently looks like this: [model diagram]
I want it to look like this: [model diagram]
For quantizing, I tried:
TFLite Model Optimization, which puts tfl.quantize ops all over the place and fails to compile for the Edge TPU.
Quantization Aware Training, which throws some annoying errors that I've been trying to work through (the standard entry point is sketched below).
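(For reference, this is the standard QAT entry point from tensorflow_model_optimization; a minimal sketch assuming a Sequential/Functional Keras model named model, not a fix for the errors above.)

import tensorflow_model_optimization as tfmot

# Rewrite the Keras model so fake-quantization ops are inserted; this is
# what later lets the converter produce a fully int8 model.
q_aware_model = tfmot.quantization.keras.quantize_model(model)

# Compile and fine-tune as usual before converting to TFLite.
q_aware_model.compile(optimizer='adam',
                      loss='sparse_categorical_crossentropy',
                      metrics=['accuracy'])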
If you know a better way to achieve the long-term goal than what I proposed, please (please please please) share! Otherwise, help on the specific quant ops would be great! Also, feel free to reach out if anything needs clarifying.
I've run into the same issues trying to convert the TensorFlow Yamnet model to full integer types in order to compile it for the Coral Edge TPU, and I think I've found a workaround.
I've been trying to stick to the tutorials posted in the tflite-model-maker section and to find a solution within this API because, from experience, I've found it to be a very powerful tool.
If your goal is to build a model which is fully compiled for the Edge TPU (meaning all layers, including the input and output ones, converted to int8 type), I'm afraid this solution won't work for you. But since you posted that you're trying to obtain a custom model with the same structure as the:
Yamnet model compiled for the Edge TPU
then I think this workaround will help you.
When you train your custom model following the basic tutorial, it is possible to export the custom model both in .tflite format:
model.export(models_path, tflite_filename='my_birds_model.tflite')
and as a full TensorFlow SavedModel:
model.export(models_path, export_format=[mm.ExportFormat.SAVED_MODEL, mm.ExportFormat.LABEL])
Then it is possible to convert the full TensorFlow SavedModel to TFLite format using the following script:
import tensorflow as tf
import numpy as np
import glob
from scipy.io import wavfile

dataset_path = '/path/to/DATASET/testing/*/*.wav'
saved_model_path = './saved_model'
samples = glob.glob(dataset_path)
input_size = 15600  # Yamnet model's input size

def representative_data_gen():
    for input_value in samples:
        sample_rate, audio_data = wavfile.read(input_value)
        audio_data = np.array(audio_data)
        # Split into fixed-size frames and normalize into the [-1, +1] range.
        splitted_audio_data = tf.signal.frame(audio_data, input_size, input_size,
                                              pad_end=True, pad_value=0) / tf.int16.max
        yield [np.float32(splitted_audio_data[0])]

tf.compat.v1.enable_eager_execution()
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_path)
converter.experimental_new_converter = True  # needed if you're using tensorflow<=2.2
converter.optimizations = [tf.lite.Optimize.DEFAULT]
#converter.inference_input_type = tf.int8  # or tf.uint8
#converter.inference_output_type = tf.int8  # or tf.uint8
converter.representative_dataset = representative_data_gen
tflite_model = converter.convert()

open(saved_model_path + "/converted_model.tflite", "wb").write(tflite_model)
As you can see, the lines that tell the converter to change the input/output type are commented out. This is because the Yamnet model expects as input normalized audio sample values in the [-1, +1] range, and their numerical representation must be float32. In fact, the compiled Yamnet model you posted uses the same dtype (float32) for its input and output layers.
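You can sanity-check that the converted model kept float32 input and output with the TFLite interpreter (a quick check against the output path of the script above):

import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path='./saved_model/converted_model.tflite')
interpreter.allocate_tensors()

# Both should report <class 'numpy.float32'> for this workaround to apply.
print(interpreter.get_input_details()[0]['dtype'])
print(interpreter.get_output_details()[0]['dtype'])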
That being said, you will end up with a TFLite model converted from the full TensorFlow model produced by tflite-model-maker. The conversion script will end with a line like the following:
fully_quantize: 0, inference_type: 6, input_inference_type: 0, output_inference_type: 0
and inference_type: 6 tells you the inference operations are suitable for compilation to the Coral Edge TPU.
The last step is to compile the model. If you compile it with the standard edgetpu_compiler command line:
edgetpu_compiler -s converted_model.tflite
the final model will have only 4 operations that run on the Edge TPU:
Number of operations that will run on Edge TPU: 4
Number of operations that will run on CPU: 53
You have to add the optional flag -a, which enables multiple subgraphs (it is experimental, though):
edgetpu_compiler -sa converted_model.tflite
After this you will have:
Number of operations that will run on Edge TPU: 44
Number of operations that will run on CPU: 13
And most of the model's operations will be mapped to the Edge TPU, namely:
Operator           Count  Status
MUL                1      Mapped to Edge TPU
DEQUANTIZE         4      Operation is working on an unsupported data type
SOFTMAX            1      Mapped to Edge TPU
GATHER             2      Operation not supported
COMPLEX_ABS        1      Operation is working on an unsupported data type
FULLY_CONNECTED    3      Mapped to Edge TPU
LOG                1      Operation is working on an unsupported data type
CONV_2D            14     Mapped to Edge TPU
RFFT2D             1      Operation is working on an unsupported data type
LOGISTIC           1      Mapped to Edge TPU
QUANTIZE           3      Operation is otherwise supported, but not mapped due to some unspecified limitation
DEPTHWISE_CONV_2D  13     Mapped to Edge TPU
MEAN               1      Mapped to Edge TPU
STRIDED_SLICE      2      Mapped to Edge TPU
PAD                2      Mapped to Edge TPU
RESHAPE            1      Operation is working on an unsupported data type
RESHAPE            6      Mapped to Edge TPU

Inference of Audio data on HuggingFace Wav2Vec TFlite Model

I've been trying to run inference on an audio sample with a TFLite version of the Wav2Vec 2.0 model that I got from HuggingFace. When I try to run inference on the model, this is the error I get:
RuntimeError: Index out of range using input dim 1; input has only 1 dims
(while executing 'StridedSlice' via Eager) Node number 1491 (TfLiteFlexDelegate) failed to invoke.
A Colab notebook with the TFLite conversion code and the inference code used can be found here.
Thank you!
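One thing worth checking with a StridedSlice rank error like this is whether the input tensor is missing its batch dimension. A diagnostic sketch using the full TensorFlow package's interpreter (which bundles the Flex delegate this model needs); the filename and sample length are assumptions, not taken from the notebook:

import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path='wav2vec2.tflite')  # placeholder filename
input_details = interpreter.get_input_details()
print(input_details[0]['shape'])  # the rank the model actually expects

# Give the audio an explicit batch dimension, e.g. (1, 16000) rather than (16000,).
audio = np.zeros(16000, dtype=np.float32)  # stands in for a real audio sample
audio = np.expand_dims(audio, axis=0)

interpreter.resize_tensor_input(input_details[0]['index'], audio.shape)
interpreter.allocate_tensors()
interpreter.set_tensor(input_details[0]['index'], audio)
interpreter.invoke()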

Output tensor size mismatch with SSD FPN Models on mobile

I am trying to get different SSD models from TensorFlow 2 to work on Android. The example .tflite file works fine, but various other models I've converted and tried all seem to have the same issue.
I've mostly followed this guide on how to convert those models for mobile: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/running_on_mobile_tf2.md
That all worked fine, but when trying to run the models on mobile, the first frame checked throws this error:
Error occurred when initializing ObjectDetector: Output tensor at index 0 is expected to have 3 dimensions, found 2.
Models I've tried:
SSD MobileNet V2 FPNLite 640x640
SSD ResNet152 V1 FPN 1024x1024 (RetinaNet152)
SSD ResNet101 V1 FPN 640x640 (RetinaNet101)
Does anyone have any idea what I might be doing wrong?
EDIT: I've found a similar issue which was opened a few days ago. Might be related: https://github.com/tensorflow/tensorflow/issues/51591#issuecomment-905216674
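A quick way to see how the converted models differ from the working example is to compare their output tensor shapes on the desktop side before deploying (a sketch; the filenames are placeholders):

import tensorflow as tf

# Compare the output signatures of the working example model and a converted one.
for path in ['example_model.tflite', 'converted_model.tflite']:  # placeholder filenames
    interpreter = tf.lite.Interpreter(model_path=path)
    interpreter.allocate_tensors()
    print(path)
    for detail in interpreter.get_output_details():
        print('  index', detail['index'], 'shape', detail['shape'])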

Error with 8-bit Quantization in Tensorflow

I have been experimenting with the new 8-bit quantization feature available in TensorFlow. I could run the example given in the blog post (quantization of GoogLeNet) without any issue, and it works fine for me!
Now I would like to apply the same to a simpler network. So I used a pre-trained network for CIFAR-10 (which was trained in Caffe), extracted its parameters, created the corresponding graph in TensorFlow, initialized the weights with the pre-trained values, and finally saved it as a GraphDef object. See this IPython notebook for the full procedure.
I then applied the 8-bit quantization with the TensorFlow script as described in Pete Warden's blog:
bazel-bin/tensorflow/contrib/quantization/tools/quantize_graph --input=cifar.pb --output=qcifar.pb --mode=eightbit --bitdepth=8 --output_node_names="ArgMax"
Now I wanted to run classification on this quantized network. So I loaded the new qcifar.pb into a TensorFlow session and passed the image (the same way I passed it to the original version). Full code can be found in this IPython notebook.
But as you can see at the end, I am getting the following error:
NotFoundError: Op type not registered 'QuantizeV2'
Can anybody suggest what I am missing here?
Because the quantized ops and kernels are in contrib, you'll need to explicitly load them in your Python script. There's an example of that in the quantize_graph.py script itself:
from tensorflow.contrib.quantization import load_quantized_ops_so
from tensorflow.contrib.quantization.kernels import load_quantized_kernels_so
This is something that we should update the documentation to mention!
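Concretely, for the contrib-era TensorFlow used here, loading the quantized graph would look something like this (a sketch; the input tensor name is an assumption about the notebook's original graph):

import tensorflow as tf
# Importing these contrib modules registers the quantized ops and kernels
# (per the note above), so 'QuantizeV2' can be found when the graph is loaded.
from tensorflow.contrib.quantization import load_quantized_ops_so
from tensorflow.contrib.quantization.kernels import load_quantized_kernels_so

# Read the quantized GraphDef and import it into the default graph.
with tf.gfile.FastGFile('qcifar.pb', 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
tf.import_graph_def(graph_def, name='')

with tf.Session() as sess:
    output = sess.graph.get_tensor_by_name('ArgMax:0')
    # Feed the image exactly as for the unquantized graph, e.g.:
    # predictions = sess.run(output, feed_dict={'input:0': image})  # 'input:0' is assumed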