TensorBoard giving -1 evaluation values for model_main_tf2.py - tensorflow

I am getting this result after training on TensorFlow object detection API 2.x on a custom dataset.
TF2: SSD MobileNet v2 320x320
Can you please suggest a scirpt modification in model_main_tf2.py to get evaluation results exactly like model_main.py?

Related

Converting tensorflow2.0 model to TensorRT engine (tensorflow2.0)

I have retrained some tensorflow2.0 model, it's working as 1 class object detector, prepared with object detection api v2 (https://tensorflow-object-detection-api-tutorial.readthedocs.io/).
After that I have converted it to onnx (tf2onnx.convert) and tested - got the same inference results.
I have tested all pretrained models (downloaded from tf model zoo https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md):
ssd_mobilenet_v2_320x320_coco17_tpu-8
ssd_mobilenet_v1_fpn_640x640_coco17_tpu-8
ssd_mobilenet_v2_fpnlite_640x640_coco17_tpu-8
ssd_resnet50_v1_fpn_640x640_coco17_tpu-8
I have retrained it by using some small batch of data.
The problem is with using it with gstreamer/deepstream. As I have seen, gstreamer consumes the onnx model, or model after converting it to TensorRT. (If I will provide onnx - model is also converted to TensorRT of course, but it's done by gstreamer right before running)
I was also trying to same pipeline with train->convert to onnx->convert to trt (or just provide onnx model to gstreamer). Same issue.
Error:
ERROR: [TRT]: [graph.cpp::computeInputExecutionUses::519] Error Code
9: Internal Error ((Unnamed Layer* 747) [Recurrence]: IRecurrenceLayer
cannot be used to compute a shape tensor)
TensorRT Version: 8.2.1.8
tf2onnx Version: 1.9.3
Is there any chance to get some help?
Or maybe I should skip the onnx model and just convert it from tensorflow to tensorRT engine? Is it possible?
Of course I can upload the model if it would help.
BR!

Using model optimizer for tensorflow slim models

I am aiming to inference tensorflow slim model with Intel OpenVINO optimizer. Using open vino docs and slides for inference and tf slim docs for training model.
It's a multi-class classification problem. I have trained tf slim mobilnet_v2 model from scratch (using sript train_image_classifier.py). Evaluation of trained model on test set gives relatively good results to begin with (using script eval_image_classifier.py):
eval/Accuracy[0.8017]eval/Recall_5[0.9993]
However, single .ckpt file is not saved (even though at the end of train_image_classifier.py run there is a message like "model.ckpt is saved to checkpoint_dir"), there are 3 files (.ckpt-180000.data-00000-of-00001, .ckpt-180000.index, .ckpt-180000.meta) instead.
OpenVINO model optimizer requires a single checkpoint file.
According to docs I call mo_tf.py with following params:
python mo_tf.py --input_model D:/model/mobilenet_v2_224.pb --input_checkpoint D:/model/model.ckpt-180000 -b 1
It gives the error (same if pass --input_checkpoint D:/model/model.ckpt):
[ ERROR ] The value for command line parameter "input_checkpoint" must be existing file/directory, but "D:/model/model.ckpt-180000" does not exist.
Error message is clear, there are not such files on disk. But as I know most tf utilities convert .ckpt-????.meta to .ckpt under the hood.
Trying to call:
python mo_tf.py --input_model D:/model/mobilenet_v2_224.pb --input_meta_graph D:/model/model.ckpt-180000.meta -b 1
Causes:
[ ERROR ] Unknown configuration of input model parameters
It doesn't matter for me in which way I will transfer graph to OpenVINO intermediate representation, just need to reach that result.
Thanks a lot.
EDIT
I managed to run OpenVINO model optimizer on frozen graph of tf slim model. However I still have no idea why had my previous attempts (based on docs) failed.
you can try converting the model to frozen format (.pb) and then convert the model using OpenVINO.
.ckpt-meta has the metagraph. The computation graph structure without variable values.
the one you can observe in tensorboard.
.ckpt-data has the variable values,without the skeleton or structure. to restore a model we need both meta and data files.
.pb file saves the whole graph (meta+data)
As per the documentation of OpenVINO:
When a network is defined in Python* code, you have to create an inference graph file. Usually, graphs are built in a form that allows model training. That means that all trainable parameters are represented as variables in the graph. To use the graph with the Model Optimizer, it should be frozen.
https://software.intel.com/en-us/articles/OpenVINO-Using-TensorFlow
the OpenVINO optimizes the model by converting the weighted graph passed in frozen form.

Is tensorflow lite model already quantized?

Does the converted tensorflow lite model always have quantized calculation and output?
Or it depends on the tensorflow model's input and inference type?
It depends on the inference type.
First the input model should be instrumented with quantization operations, https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/quantize can be helpful with that.
The resulting eval graph should be provided to TOCO for conversion with --inference_type=QUANTIZED_UINT8 and the correct --mean_values and --std_values for the input arrays.
See https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/toco/g3doc/cmdline_examples.md for some examples of how to invoke TOCO for quantized models.
UPDATE: We have added a new post training quantization tool: https://medium.com/tensorflow/tensorflow-model-optimization-toolkit-post-training-integer-quantization-b4964a1ea9ba that should be easier than the old methods of quantization.

TensorFlow - How to Get My Loss Value from tf.Estimaor

I am trying to train an alexnet model using tf.estimator in TensorFlow. The training process works smoothly, and I can see the logs well displayed.
INFO:tensorflow:loss = 2.61362, step = 294
INFO:tensorflow:Saving checkpoints for 325 into /home/olu/Dev/data_base
sign_base/output/Checkpoints_N_Model/trained_alexnet_model/model.ckpt.
INFO:tensorflow:Loss for final step: 2.94104.
Below is how the training function is called:
traffic_sign_classifier.train(input_fn=train_input_fn,hooks=[logging_hook])
Please, how can I get my loss value as a normal python floating point from the tf estimator object
You can download the loss values from tensorboard when finishing your training.
The .evaluate() method on the estimator returns a dictionary of metrics you can specify in your model function, in the estimator spec, https://www.tensorflow.org/api_docs/python/tf/estimator/EstimatorSpec eval_metric_ops. I found the answer on this thread on GitHub Link

TF Object Detection API - Trouble running quantized network after freezing and quantizing my fine-tuned network

TensorFlow Object Detection API
Using the TensorFlow Object Detection API to retrain MobileNet on my own DataSet. The issue occurs as I try to run my inference graph that has been both frozen and quantized.
System:
Ubuntu 16.04,
TensorFlow 1.2 (from source, CPU only),
Bazel 0.4.5
Issue:
Use provided frozen_graph.pb from model zoo.
Quantize to 8-bit using
bazel-bin/tensorflow/tools/graph_transforms/transform_graph.
Run inference
This works, however,
Re-train and produce my own frozen_graph.pb using object_detection/export_inference_graph.py
Quantize to 8-bit using bazel-bin/tensorflow/tools/graph_transforms/transform_graph
Run inference <-- Produces error
Does NOT work, and the error I'm getting during the attempt to run the graph is:
File
"/home/unibap/TensorFlow/tensorflow-python2-sse4.2/local/lib/python2.7/site-packages/tensorflow/python/client/session.py",
line 1298, in _do_call
raise type(e)(node_def, op, message) tensorflow.python.framework.errors_impl.InvalidArgumentError: The node
'Preprocessor/map/while/ResizeImage/ResizeBilinear/eightbit' has
inputs from different frames. The input
'Preprocessor/map/while/ResizeImage/size' is in frame
'Preprocessor/map/while/Preprocessor/map/while/'. The input
'Preprocessor/map/while/ResizeImage/ResizeBilinear_eightbit/Preprocessor/map/while/ResizeImage/ExpandDims/quantize'
is in frame ''.
Since I can quantize and run the provided frozen_graph.pb the issue has to be with the export tool? Which export tool was used to create the frozen_graph.pb that are in the model zoo? Or how was the export tool called?
PS:
Quote from comments in export_inference_graph.pb, assuring me that it should produce a frozen graph if checkpoint is provided.
"Optionally, one can freeze the graph by converting the weights in the provided
checkpoint as graph constants thereby eliminating the need to use a checkpoint
file during inference."
Best