TensorFlow Lite Error: "(RESHAPE) failed to prepare" when running converted RepNet model in TFLite in Flutter - tensorflow2.0

I'm trying to run a converted RepNet model in TFLite on mobile (iOS and Android) using Flutter, and the tf_lite_flutter package.
I have successfully converted the model to TFLite by adapting the Colab notebook provided by the authors. The notebook I used can be found in this repository, along with the converted model.
To make sure everything was working before attempting to run on an edge device, I checked everything with the Python TFLite API (this notebook). Everything indeed worked well - the output of the TFLite model matches the output of the Google Colab provided by the authors.
I created a Flutter project to get the model running on mobile. I've tried passing in the default input and output, resulting from calls to interpreter.getInputTensors() and interpreter.getOutputTensors() respectively. When I use this project to run the model, I encounter the following error:
E/tflite (26540): tensorflow/lite/kernels/reshape.cc:69 num_input_elements != num_output_elements (1 != 0)
E/tflite (26540): Node number 1 (RESHAPE) failed to prepare.
I'm admittedly pretty new to TensorFlow and TensorFlow Lite, so my debugging ability is somewhat limited. It does seem strange to me that the expected output shape is 0. Considering it works with the Python API, I'm not sure why it isn't working on-device. My only suspicion is that the batch size is not being configured properly: using the shape_signature field, as in interpreter.get_input_details()[0]['shape_signature'], I can see that the batch size is dynamic (value -1).
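For reference, a minimal Python sketch of how the dynamic batch dimension can be inspected and pinned before allocation (the model path below is a placeholder; as far as I can tell, tflite_flutter exposes an analogous resizeInputTensor/allocateTensors pair on the Dart side):
import tensorflow as tf

# Placeholder path for the converted RepNet model.
interpreter = tf.lite.Interpreter(model_path="converted_model.tflite")

input_details = interpreter.get_input_details()
print(input_details[0]["shape"])            # shape the interpreter defaults to
print(input_details[0]["shape_signature"])  # -1 marks the dynamic batch dimension

# Pin the dynamic dimension to a concrete value, then allocate.
fixed_shape = input_details[0]["shape_signature"].copy()
fixed_shape[fixed_shape == -1] = 1
interpreter.resize_tensor_input(input_details[0]["index"], fixed_shape.tolist())
interpreter.allocate_tensors()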
The model was converted using TensorFlow 2.5 in Python, and is being run using the standard TFLite 2.5 binaries (no GPU delegate).
Any suggestions for fixing this error would be appreciated!

Related

Freeze Saved_Model.pb created from converted Keras H5 model

I am currently trying to train a custom model for use in Unity (Barracuda) for object detection, and I am struggling near what I believe to be the last part of the pipeline. Following various tutorials and git repos, I have done the following:
1. Using Darknet, I trained a custom model based on Tiny-YOLOv2. (Model tested successfully in a webcam Python script.)
2. I took the final weights from that training and converted them to a Keras (.h5) file. (Model tested successfully in a webcam Python script.)
3. From Keras, I then use tf.saved_model to turn it into a saved_model.pb.
4. From saved_model.pb, I then convert it to an ONNX file using tf2onnx.convert.
5. Supposedly, from there it can work in one of a few Unity sample projects...
...however, this model fails to load in the Unity sample projects I've tried to use. From various posts it seems that I may need to use a 'frozen' saved_model.pb before converting it to ONNX. However, all the guides and Python functions that seem to be used for freezing saved models require a lot more arguments than I have awareness of, or data for, after going through so many systems. https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py - for example, after converting to Keras I am only left with an .h5 file, with no knowledge of what an input_graph_def or output_node_names might refer to.
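For reference, this is roughly the TF2-style freezing approach I've seen suggested, shown here as a sketch (the .h5 path is a placeholder, and I haven't verified it end to end for this model):
import tensorflow as tf
from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2

# Placeholder path for the converted Keras model.
model = tf.keras.models.load_model("yolov2_tiny_custom.h5")

# Wrap the Keras model in a concrete function so it can be frozen.
full_model = tf.function(lambda x: model(x))
concrete_func = full_model.get_concrete_function(
    tf.TensorSpec(model.inputs[0].shape, model.inputs[0].dtype))

# Convert variables to constants ("freeze" the graph).
frozen_func = convert_variables_to_constants_v2(concrete_func)
graph_def = frozen_func.graph.as_graph_def()

# These are what freeze_graph.py-style tools mean by input_graph_def
# and output_node_names.
print([t.name for t in frozen_func.inputs])
print([t.name for t in frozen_func.outputs])

tf.io.write_graph(graph_def, ".", "frozen_graph.pb", as_text=False)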
Additionally, for whatever reason, I cannot find any TF version (1 or 2) that can successfully run that Python script: 'from tensorflow.python.checkpoint import checkpoint_management' fails, and it genuinely seems like that module no longer exists.
I am not sure why I am going through all of these conversions and steps, but every attempt to find a cleaner process between training and Unity seems to lead only to dead ends.
Any help or guidance on this topic would be sincerely appreciated, thank you.

Object Detection Few-Shot training with TensorFlow Lite

I am trying to create a mobile app that uses object detection to detect a specific type of object. To do this I am starting with the Tensorflow object detection example Android app, which uses TF2 and ssd_mobilenet_v1.
I'd like to try Few-Shot training (Colab link), so I started by replacing the example app's SSD MobileNet v1 download with the Colab's output file model.tflite. However, this causes the app to crash with the following error:
java.lang.IllegalStateException: This model does not contain associated files, and is not a Zip file.
at org.tensorflow.lite.support.metadata.MetadataExtractor.assertZipFile(MetadataExtractor.java:313)
at org.tensorflow.lite.support.metadata.MetadataExtractor.getAssociatedFile(MetadataExtractor.java:164)
at org.tensorflow.lite.examples.detection.tflite.TFLiteObjectDetectionAPIModel.create(TFLiteObjectDetectionAPIModel.java:126)
at org.tensorflow.lite.examples.detection.DetectorActivity.onPreviewSizeChosen(DetectorActivity.java:99)
I realize the Colab uses ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8.tar.gz - does this mean there are changes needed in the app code - or is there something more fundamentally wrong with my approach?
Update: I also tried the Lite output of the Colab tf2_image_retraining and got the same error.
The fix apparently was https://github.com/tensorflow/examples/compare/master...cachvico:darren/fix-od - .tflite files can now be zip files including the labels, but the example app doesn't work with the old format.
This doesn't throw an error when using the Few-Shot Colab output, although I'm not getting results yet - pointing the app at pictures of rubber ducks doesn't work yet.
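For anyone hitting the same MetadataExtractor error: one way to bundle a label file into the .tflite is the tflite_support metadata writers. A sketch, where the paths and normalization values are placeholders for your own model:
from tflite_support.metadata_writers import object_detector
from tflite_support.metadata_writers import writer_utils

# Placeholder paths and normalization parameters.
MODEL_PATH = "model.tflite"
LABEL_FILE = "labelmap.txt"
OUTPUT_PATH = "model_with_metadata.tflite"

# Arguments: model buffer, input normalization mean, std, label files.
writer = object_detector.MetadataWriter.create_for_inference(
    writer_utils.load_file(MODEL_PATH), [127.5], [127.5], [LABEL_FILE])
writer_utils.save_file(writer.populate(), OUTPUT_PATH)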

TFRecords shown as faulty in TF2

I have a couple of TFRecord files that I made myself.
They work perfectly in TF1; I have used them in several projects.
However, if I want to use them with the TensorFlow Object Detection API in TF2 (running the model_main_tf2.py script), I see the following in TensorBoard:
(screenshot of the TensorBoard Images tab)
It totally messes up the images.
(Running the /work/tfapi/research/object_detection/model_main.py script, or even legacy train, the images look fine.)
Is TF2 using a different kind of encoding for TFRecords?
Or what else could cause such results?
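One way to check whether the records themselves decode correctly in TF2, independent of the Object Detection API, is to parse one example directly. A sketch assuming the standard image/encoded and image/format feature keys (adjust if your records were written differently; the file path is a placeholder):
import tensorflow as tf

# Feature keys assumed to follow the TF Object Detection API convention.
feature_description = {
    "image/encoded": tf.io.FixedLenFeature([], tf.string),
    "image/format": tf.io.FixedLenFeature([], tf.string, default_value="jpeg"),
}

dataset = tf.data.TFRecordDataset("train.record")  # placeholder path
for raw_record in dataset.take(1):
    example = tf.io.parse_single_example(raw_record, feature_description)
    image = tf.io.decode_image(example["image/encoded"])
    print(example["image/format"].numpy(), image.shape, image.dtype)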

Workflow for converting .pb files to .tflite

Overview
I know this subject has been discussed many times, but I am having a hard time understanding the workflow, or rather, the variations of the workflow.
For example, imagine you are installing TensorFlow on Windows 10. The main goal being to train a custom model, convert to TensorFlow Lite, and copy the converted .tflite file to a Raspberry Pi running TensorFlow Lite.
The confusion for me starts with the conversion process. After following along with multiple guides, it seems TensorFlow is often installed with pip or Anaconda. But then I see detailed tutorials which indicate it needs to be built from source in order to convert TensorFlow models to TFLite models.
To make things more interesting, I've also seen models which are converted via Python scripts as seen here.
Question
So far I have seen 3 ways to do this conversion, and it could just be that I don't have a grasp on the full picture. Below are the abbreviated methods I have seen:
Build from source, and use the TensorFlow Lite Optimizing Converter (TOCO):
bazel run --config=opt tensorflow/lite/toco:toco -- --input_file=$OUTPUT_DIR/tflite_graph.pb --output_file=$OUTPUT_DIR/detect.tflite ...
Use the TensorFlow Lite Converter Python API:
converter = tf.lite.TFLiteConverter.from_saved_model(export_dir)
tflite_model = converter.convert()
with tf.io.gfile.GFile('model.tflite', 'wb') as f:
    f.write(tflite_model)
Use the tflite_convert CLI utilities:
tflite_convert --saved_model_dir=/tmp/mobilenet_saved_model --output_file=/tmp/mobilenet.tflite
I *think* I understand that options 2 and 3 are the same, in the sense that the tflite_convert utility is installed and can be invoked either from the command line or through a Python script. But is there a specific reason you should choose one over the other?
And lastly, what really gets me confused is option 1. And maybe it's a version thing (1.x vs 2.x)? But what's the difference between the TensorFlow Lite Optimizing Converter (TOCO) and the TensorFlow Lite Converter? It appears that in order to use TOCO you would have to build TensorFlow from source, so is there a reason you would use one over the other?
There is no difference in the output from the different conversion methods, as long as the parameters remain the same. The Python API is better if you want to generate TFLite models in an automated way (e.g., a Python script that is run periodically).
The TensorFlow Lite Optimizing Converter (TOCO) was the first version of the TF->TFLite converter. It was recently deprecated and replaced with a new converter that can handle more ops/models. So I wouldn't recommend using toco:toco via bazel, but rather use tflite_convert as mentioned here.
You should never have to build the converter from source, unless you are making some changes to it and want to test them out.
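One wrinkle worth noting: tf.lite.TFLiteConverter.from_saved_model expects a SavedModel directory. If all you have is a frozen GraphDef .pb, the TF1-compat converter path still exists. A sketch, where the file name, tensor names, and shapes are placeholders you'd replace with your graph's actual values:
import tensorflow as tf

# Placeholder file name, tensor names, and input shape.
converter = tf.compat.v1.lite.TFLiteConverter.from_frozen_graph(
    graph_def_file="tflite_graph.pb",
    input_arrays=["input"],
    output_arrays=["output"],
    input_shapes={"input": [1, 300, 300, 3]},
)
tflite_model = converter.convert()
with tf.io.gfile.GFile("detect.tflite", "wb") as f:
    f.write(tflite_model)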

Convert PoseNet TensorFlow.js params to TensorFlow Lite

I'm fairly new to TensorFlow so I apologize if I'm saying something absurd.
I've been playing with the PoseNet model in the browser using TensorFlow.js. In this project, I can change the algorithm and parameters so I can get better results on the detection of certain poses. The most important params in my use case are the Multiplier, Quant Bytes and Output Stride.
So far so good, I have the results I want. However, I want to convert these results to TensorFlow Lite so I can use them in an iOS application. I managed to find the PoseNet model as a TensorFlow Lite (.tflite) file, and I even found an iOS app example provided by TensorFlow, so I'm able to load the model file and have it working on iOS.
The problem is... I'm unable to change the params (Multiplier, Quant Bytes and Output Stride) in the iOS app. I can't find anywhere how to do this. I've searched for these params in the iOS app source code, and I've tried to find ways to convert a TensorFlow.js model to TensorFlow Lite so I can load the model with the params I want in the app, but no luck.
I'm writing this post so maybe you guys can point me in the right direction so I'm able to "translate" what I have on TensorFlow.js to TensorFlow Lite.
EDIT:
This is what I've learned in the last couple of days:
TFLite is designed for serving a fixed model with a lightweight runtime, so modifying model parameters on demand is not a design goal for it.
I looked at the TF.js code for PoseNet and found a similar design. It seems you can modify parameters only because they actually ship a different model for each combination of params. https://github.com/tensorflow/tfjs-models/blob/b72c10bdbdec6b04a13f780180ed904736fa52a5/posenet/src/checkpoints.ts#L37
TFLite models generally don't support dynamic parameters. Output Stride, Multiplier and Quant Bytes are fixed params when the neural network is created.
So what I want to do is extract the weights from the TF.js model and put them into existing MobileNet code.
And that's where I need help now. Could anyone point me in the right direction to load and change the model so I can then convert it to tflite with my own params?
EDIT2:
I found a repo that is helping me convert TF.js models to TF Lite: Griffin98/posenet_tfjs2tflite. I still can't define the Quant Bytes, though.
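For what it's worth, the closest TFLite analogue of the TF.js quantBytes setting I've found is post-training quantization at conversion time. A sketch, where the SavedModel path is a placeholder and the quantBytes mapping is only a rough analogy:
import tensorflow as tf

# Placeholder path to a SavedModel reconstructed from the TF.js weights.
converter = tf.lite.TFLiteConverter.from_saved_model("posenet_saved_model")

# Roughly: quantBytes=4 ~ leave weights in float32 (no optimizations);
# quantBytes=2 ~ float16 quantization, as configured below.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]

tflite_model = converter.convert()
with open("posenet_fp16.tflite", "wb") as f:
    f.write(tflite_model)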