I am converting several models from TensorFlow.js, Keras, and TensorFlow to TensorFlow Lite and then to TensorFlow Lite Micro C header files.
I can do the main conversions but have found little information about using tflite_convert for quantization.
I am wondering if people could post working command-line examples. As far as I can tell we are encouraged to use Python to do the conversions, but I would prefer to stay on the command line.
I have summarized what I am working on here https://github.com/hpssjellis/my-examples-for-the-arduino-portentaH7/tree/master/m09-Tensoflow/tfjs-convert-to-arduino-header.
This is what I have so far, and it works: it converts a saved TensorFlow.js model.json into a .pb file, which is converted to a .tflite file and then to a C header file to run on an Arduino-style microcontroller.
tensorflowjs_converter --input_format=tfjs_layers_model --output_format=keras_saved_model ./model.json ./
tflite_convert --keras_model_file ./ --output_file ./model.tflite
xxd -i model.tflite model.h
But my files do not get any smaller when I try any quantization.
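For reference, the Python route everyone points to seems to be roughly the following (a minimal sketch on my part, assuming the Keras SavedModel directory written by tensorflowjs_converter above; it applies post-training dynamic-range quantization to the weights):

import tensorflow as tf

# Post-training dynamic-range quantization sketch; "./saved_model" is a placeholder
# for wherever tensorflowjs_converter wrote the Keras SavedModel.
converter = tf.lite.TFLiteConverter.from_saved_model("./saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # quantize weights to int8
tflite_model = converter.convert()

with open("model_quant.tflite", "wb") as f:
    f.write(tflite_model)

My understanding is that without something like Optimize.DEFAULT the converter just writes float32 weights, which would explain why my file size never changes.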
The tflite_convert command-line help at TensorFlow is not specific enough: https://www.tensorflow.org/lite/convert/cmdline
Here are some examples I have found using both tflite_convert and tensorflowjs_converter; some seem to work on other people's models but do not seem to work on my own models:
tflite_convert --output_file=/home/wang/Downloads/deeplabv3_mnv2_pascal_train_aug/optimized_graph.tflite --graph_def_file=/home/wang/Downloads/deeplabv3_mnv2_pascal_train_aug/frozen_inference_graph.pb --inference_type=FLOAT --inference_input_type=QUANTIZED_UINT8 --input_arrays=ImageTensor --input_shapes=1,513,513,3 --output_arrays=SemanticPredictions --allow_custom_ops
tflite_convert --graph_def_file=<your_frozen_graph> \
--output_file=<your_chosen_output_location> \
--input_format=TENSORFLOW_GRAPHDEF \
--output_format=TFLITE \
--inference_type=QUANTIZED_UINT8 \
--output_arrays=<your_output_arrays> \
--input_arrays=<your_input_arrays> \
--mean_values=<mean of input training data> \
--std_dev_values=<standard deviation of input training data>
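For what it's worth, my reading of the old TOCO documentation (an assumption on my part, not something spelled out in these examples) is that mean_values and std_dev_values describe how the quantized uint8 input maps back to float, roughly float_value = (uint8_value - mean_value) / std_dev_value:

# Assumed TOCO relationship: float_value = (uint8_value - mean_value) / std_dev_value.
# With mean_value=127.5 and std_dev_value=127.5, uint8 0..255 maps to float -1.0..1.0.
for q in (0, 255):
    print((q - 127.5) / 127.5)  # -> -1.0 and 1.0

So mean 127.5 / std 127.5 would correspond to inputs normalized to [-1, 1], and mean 0 / std 255 to inputs in [0, 1].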
tflite_convert --graph_def_file=/tmp/frozen_cifarnet.pb \
--output_file=/tmp/quantized_cifarnet.tflite \
--input_format=TENSORFLOW_GRAPHDEF \
--output_format=TFLITE \
--inference_type=QUANTIZED_UINT8 \
--output_arrays=CifarNet/Predictions/Softmax \
--input_arrays=input \
--mean_values 121 \
--std_dev_values 64
tflite_convert \
--graph_def_file=frozen_inference_graph.pb \
--output_file=new_graph.tflite \
--input_format=TENSORFLOW_GRAPHDEF \
--output_format=TFLITE \
--input_shape=1,600,600,3 \
--input_array=image_tensor \
--output_array=detection_boxes,detection_scores,detection_classes,num_detections \
--inference_type=QUANTIZED_UINT8 \
--inference_input_type=QUANTIZED_UINT8 \
--mean_values=128 \
--std_dev_values=127
tflite_convert --graph_def_file=~YOUR PATH~/yolov3-tiny.pb --output_file=~YOUR PATH~/yolov3-tiny.tflite --input_format=TENSORFLOW_GRAPHDEF --output_format=TFLITE --input_shape=1,416,416,3 --input_array=~YOUR INPUT NAME~ --output_array=~YOUR OUTPUT NAME~ --inference_type=FLOAT --input_data_type=FLOAT
tflite_convert --graph_def_file=built_graph/yolov2-tiny.pb --output_file=built_graph/yolov2_graph.lite --input_format=TENSORFLOW_GRAPHDEF --output_format=TFLITE --input_shape=1,416,416,3 --input_array=input --output_array=output --inference_type=FLOAT --input_data_type=FLOAT
tflite_convert --graph_def_file=frozen_inference_graph.pb --output_file=optimized_graph.lite --input_format=TENSORFLOW_GRAPHDEF --output_format=TFLITE --input_shape=1,1024,1024,3 --input_array=image_tensor --output_array=Softmax
tensorflowjs_converter --quantize_float16 --input_format=tf_hub 'https://tfhub.dev/google/imagenet/mobilenet_v1_100_224/classification/1' ./
tensorflowjs_converter --control_flow_v2=True --input_format=tf_hub --quantize_uint8=* --strip_debug_ops=True --weight_shard_size_bytes=4194304 --output_node_names='Postprocessor/ExpandDims_1,Postprocessor/Slice' --signature_name 'serving_default' https://tfhub.dev/tensorflow/ssd_mobilenet_v2/2 test
If anyone has working examples of quantization that they can explain, especially what is important to include and what is optional, that would be very helpful. I use Netron to visualize the models, so I should be able to see when a float input has been changed to int8. A bit of an explanation would be helpful too.
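Besides Netron, a quick way to do the same check from Python (a small sketch; model.tflite is the file produced by the commands above):

import tensorflow as tf

# Load the converted model and print the input/output tensor dtypes to confirm
# whether quantization actually happened (float32 vs int8/uint8).
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()
print(interpreter.get_input_details()[0]["dtype"])
print(interpreter.get_output_details()[0]["dtype"])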
I recently tried this set of commands, which compiled, but the quantized file was larger than the un-quantized file:
tensorflowjs_converter --input_format=tfjs_layers_model --output_format=keras_saved_model ./model.json ./saved_model
tflite_convert --keras_model_file ./saved_model --output_file ./model.tflite
xxd -i model.tflite model.h
tflite_convert --saved_model_dir=./saved_model \
--output_file=./model_int8.tflite \
--input_format=TENSORFLOW_GRAPHDEF \
--output_format=TFLITE \
--inference_type=QUANTIZED_UINT8 \
--output_arrays=1,1 \
--input_arrays=1,2 \
--mean_value=128 \
--std_dev_value=127
xxd -i model_int8.tflite model_int8.h
The Python way is easy as well, and you can find official examples here:
https://www.tensorflow.org/lite/performance/post_training_quantization
There is an entire section on this. I think you did not train the model with quantization-aware training, so post-training quantization is what you are looking for.
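Building on that page, here is a minimal sketch of post-training full-integer quantization (assuming a Keras SavedModel in ./saved_model; the paths, the input shape, and the random representative data are placeholders to replace with your own):

import numpy as np
import tensorflow as tf

# Post-training full-integer quantization sketch; replace the placeholder path,
# input shape, and random samples with values that match your model.
converter = tf.lite.TFLiteConverter.from_saved_model("./saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]

def representative_data():
    for _ in range(100):
        yield [np.random.rand(1, 2).astype(np.float32)]  # placeholder input shape

converter.representative_dataset = representative_data
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8   # int8 in/out is what TFLite Micro expects
converter.inference_output_type = tf.int8

with open("model_int8.tflite", "wb") as f:
    f.write(converter.convert())

After that, the same xxd -i model_int8.tflite model_int8.h step should work unchanged, and Netron should show int8 input and output tensors.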
Related
OS: Ubuntu 18.04
TensorFlow model: ssd_mobilenet_v2_quantized_300x300_coco_2019_01_03
I have retrained the ssd_mobilenet_v1_quantized_coco model using my own data. I have successfully generated frozen_inference_graph.pb using the script export_inference_graph.py. But when I ran the script tflite_convert.py, I got the error "ValueError: Invalid tensors 'normalized_input_image_tensor' were found." The parameters of the script tflite_convert.py are:
python tflite_convert.py \
--output_file="converted_quant_traffic_tflite/traffic_tflite_graph.tflite" \
--graph_def_file="traffic_inference_graph_lite/frozen_inference_graph.pb" \
--input_arrays='normalized_input_image_tensor' \
--inference_type=QUANTIZED_UINT8 \
--output_arrays='TFLite_Detection_PostProcess','TFLite_Detection_PostProcess:1','TFLite_Detection_PostProcess:2','TFLite_Detection_PostProcess:3' \
--mean_values=128 \
--std_dev_values=128 \
--input_shapes=1,300,300,3 \
--default_ranges_min=0 \
--default_ranges_max=6 \
--change_concat_input_ranges=false \
--allow_nudging_weights_to_use_fast_gemm_kernel=true \
--allow_custom_ops
Obviously, the input_arrays was not set correctly. Please advise me how to set the input_arrays.
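One way to find usable tensor names (a sketch, not part of the original post; the path below is the frozen graph mentioned above) is to load the GraphDef in Python and print its node names, then pass matching names to --input_arrays and --output_arrays:

import tensorflow as tf

# List node names in the frozen graph so the right --input_arrays /
# --output_arrays values can be picked out (path is a placeholder).
graph_def = tf.compat.v1.GraphDef()
with tf.io.gfile.GFile("traffic_inference_graph_lite/frozen_inference_graph.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

for node in graph_def.node:
    print(node.op, node.name)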
I use a quantized model of DeepLab v3 on a Coral Dev Board. The result is not really precise, so I want to use a non-quantized model. I can't find a non-quantized tflite model of DeepLab, so I want to generate it.
I download an xception65_coco_voc_trainval model from the TensorFlow GitHub:
xception65_coco_voc_trainval
Then I use this command to transform the input:
bazel-bin/tensorflow/tools/graph_transforms/transform_graph \
--in_graph="/path/to/xception65_coco_voc_trainval.pb" \
--out_graph="/path/to/xception65_coco_voc_trainval_flatten.pb" \
--inputs='ImageTensor' \
--outputs='SemanticPredictions' \
--transforms='
strip_unused_nodes(type=quint8, shape="1,513,513,3")
flatten_atrous_conv
fold_constants(ignore_errors=true, clear_output_shapes=false)
fold_batch_norms
fold_old_batch_norms
remove_device
sort_by_execution_order'
Then I generate the tflite file with this command:
tflite_convert \
--graph_def_file="/tmp/deeplab_mobilenet_v2_opt_flatten_static.pb" \
--output_file="/tmp/deeplab_mobilenet_v2_opt_flatten_static.tflite" \
--output_format=TFLITE \
--input_shape=1,513,513,3 \
--input_arrays="ImageTensor" \
--inference_type=FLOAT \
--inference_input_type=QUANTIZED_UINT8 \
--std_dev_values=128 \
--mean_values=128 \
--change_concat_input_ranges=true \
--output_arrays="SemanticPredictions" \
--allow_custom_ops
This command generates a tflite file. I download the file onto my Coral Dev Board and try to run it. I use this example on GitHub to try my DeepLab model on Coral:
deeplab on coral
When I start the program there is an error:
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Aborted
The error comes from that line:
engine = BasicEngine(args.model)
I'm trying to quantize the ssd_mobilenetv2_oidv4 model from Tensorflow object detection model zoo, but after quantization the model stops working entirely.
To get the tflite graph, I ran
export_tflite_ssd_graph.py \
--pipeline_config_path=$CONFIG_FILE \
--trained_checkpoint_prefix=$CHECKPOINT_PATH \
--output_directory=$OUTPUT_DIR \
--add_postprocessing_op=true
Then to generate the tflite file, I ran
tflite_convert \
--graph_def_file=$OUTPUT_DIR/tflite_graph.pb \
--output_file=$OUTPUT_DIR/detect.tflite \
--input_shapes=1,300,300,3 \
--input_arrays=normalized_input_image_tensor \
--output_arrays='TFLite_Detection_PostProcess','TFLite_Detection_PostProcess:1','TFLite_Detection_PostProcess:2','TFLite_Detection_PostProcess:3' \
--inference_type=QUANTIZED_UINT8 \
--mean_values=128 \
--std_dev_values=128 \
--change_concat_input_ranges=false \
--allow_custom_ops \
--default_ranges_min=0 \
--default_ranges_max=6
Then I used this example android app to test it. When I try running it, it just shows 10 bounding boxes that never move, apparently detecting a tortoise with 50% confidence. I'm not sure what all that means, but Tortoise is the first class in the label map, if that's relevant.
Anyone know what's going on?
Here's a screenshot of the quantized model in action:
I successfully retrained a quantized MobileNet model (architecture="mobilenet_1.0_128_quantized") with my own image dataset:
python3 -m scripts.retrain \
--bottleneck_dir=tf_files/bottlenecks_quant \
--how_many_training_steps=50000 \
--model_dir=tf_files/models/ \
--summaries_dir=tf_files/training_summaries/"mobilenet_1.0_128_quant" \
--output_graph=tf_files/retrained_graph_50000_1.0_128.pb \
--output_labels=tf_files/retrained_labels.txt \
--architecture="mobilenet_1.0_128_quantized" \
--image_dir=images
When I try to convert .pb file to .tflite using
toco \
--graph_def_file=tf_files/retrained_graph_50000_1.0_128.pb \
--output_file=tf_files/retrained_graph_50000_1.0_128.tflite \
--input_format=TENSORFLOW_GRAPHDEF --output_format=TFLITE \
--inference_type=QUANTIZED_UINT8 \
--input_shape="1,128,128,3" \
--input_array=input \
--output_array=outputs \
--std_dev_values=127.5 --mean_value=127.5
It fails with the following error:
ValueError: Invalid tensors 'outputs' were found.
I want to convert to TFLite format
toco \
--input_file=$TRAINING_DIR/retrained_graph.pb \
--input_format=TENSORFLOW_GRAPHDEF \
--output_format=TFLITE \
--output_file=/$TRAINING_DIR/${ARCHITECTURE}.tflite \
--inference_type=QUANTIZED_UINT8 \
--input_arrays=input \
--output_arrays=final_result \
--input_shapes=1,224,224,3 \
--inference_input_type=QUANTIZED_UINT8 \
--mean_values=128 \
--std_values=128 \
--default_ranges_min=0 \
--quantize_weights=true \
--default_ranges_max=6
and it fails
F tensorflow/contrib/lite/toco/graph_transformations/quantize.cc:600] Check failed: is_rnn_state_array
What am I missing?
TOCO currently doesn't support conversion of RNNs and LSTMs to TFLite. This is on our roadmap and these docs describe existing op support.