TFRecords shown faulty in TF2 - tensorflow2.0

I have a couple of TFRecord files I made myself.
They work perfectly in TF1; I have used them in several projects.
However, if I want to use them in the TensorFlow Object Detection API with TF2 (running the model_main_tf2.py script), I see the following in TensorBoard:
[screenshot of the TensorBoard Images tab]
It totally messes up the images.
(Running the /work/tfapi/research/object_detection/model_main.py script, or even the legacy train.py, the images look fine.)
Is TF2 using a different kind of encoding for TFRecords?
Or what else could cause such results?
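One way to rule out the records themselves is to decode a few examples directly and check the shapes; a minimal sketch, assuming the standard Object Detection API feature key image/encoded and a placeholder file name train.record:

import tensorflow as tf

# Decode the first few records; 'train.record' is a placeholder, and the
# feature key 'image/encoded' is the standard OD API convention.
dataset = tf.data.TFRecordDataset('train.record')
features = {'image/encoded': tf.io.FixedLenFeature([], tf.string)}
for raw in dataset.take(3):
    example = tf.io.parse_single_example(raw, features)
    image = tf.io.decode_image(example['image/encoded'])
    print(image.shape, image.dtype)  # sane shapes suggest the records are intact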

Related

Freeze Saved_Model.pb created from converted Keras H5 model

I am currently trying to train a custom model for use in Unity (Barracuda) for object detection, and I am struggling with what I believe to be the last part of the pipeline. Following various tutorials and Git repos, I have done the following...
Using Darknet, I trained a custom model using the Tiny-YOLOv2 architecture. (Model tested successfully in a webcam Python script.)
I took the final weights from that training and converted them to a Keras (.h5) file. (Model tested successfully in a webcam Python script.)
From Keras, I then used tf.saved_model to turn it into a saved_model.pb.
From the saved_model.pb I then converted it using tf2onnx.convert to produce an ONNX file.
Supposedly from there it can then work in one of a few Unity sample projects...
...however, the model fails to load in the Unity sample projects I've tried to use. From various posts it seems that I may need to use a 'frozen' saved_model.pb before converting it to ONNX. However, all the guides and Python functions that seem to be used for freezing saved models require a lot more arguments than I have awareness of, or data for, after going through so many systems. https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py - for example, after converting to Keras, I am only left with an .h5 file, with no knowledge of what an input_graph_def or output_node_names might refer to.
Additionally, for whatever reason, I cannot find any TF version (1 or 2) that can successfully run this Python script using 'from tensorflow.python.checkpoint import checkpoint_management'; it genuinely seems like it no longer exists.
I am not sure why I am going through all of these conversions and steps, but every attempt to find a cleaner process between training and Unity seemed to lead only to dead ends.
Any help or guidance on this topic would be sincerely appreciated, thank you.
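As a side note on the freezing question: recent versions of tf2onnx can consume a Keras model directly, so a separate frozen graph is often unnecessary. Below is a minimal sketch under that assumption; the file names are placeholders, and a YOLO-style model may need custom_objects when loading.

import tensorflow as tf
import tf2onnx

# Load the Keras model converted from the Darknet weights.
# 'model.h5' is a placeholder; custom layers may require custom_objects=...
model = tf.keras.models.load_model('model.h5')

# Convert straight to ONNX; tf2onnx traces the model itself, so no manual
# input_graph_def or output_node_names are needed.
model_proto, _ = tf2onnx.convert.from_keras(model, output_path='model.onnx')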

Can't manage to open TensorFlow SavedModel for usage in Keras

I'm kinda new to TensorFlow and Keras, so please excuse any accidental stupidity, but I have an issue. I've been trying to load models from the TensorFlow Detection Zoo, but haven't had much success.
I can't figure out how to read these saved_model folders (they contain a saved_model.pb file plus assets and variables folders) so that they're accepted by Keras. Nor can I figure out a way to convert these models so that they may be loaded. I've tried converting the SavedModel to ONNX and then converting the ONNX model to Keras, but that didn't work. Trying to load the original model as a SavedModel and then saving this loaded model in another format gave me no success either.
Since you are new to TensorFlow (and, I guess, deep learning), I would suggest you stick with the Object Detection API, because the detection zoo models interface best with it. If you have already downloaded the model, you just need to export it using the exporter_main_v2.py script. This article explains it very well (link).
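Once exported, the SavedModel can be driven with plain TF2 rather than Keras. A minimal sketch, assuming the directory layout produced by exporter_main_v2.py; the path and the dummy input shape are placeholders.

import tensorflow as tf

# Load the exported detection model; the path is a placeholder.
detect_fn = tf.saved_model.load('exported_model/saved_model')

# Zoo detection models expect a batched uint8 image tensor.
image = tf.zeros([1, 320, 320, 3], dtype=tf.uint8)  # dummy input
detections = detect_fn(image)
print(list(detections.keys()))  # e.g. detection_boxes, detection_scores, ...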

How can I convert the model I trained with Tensorflow (python) for use with TensorflowJS without involving IBM cloud (from the step I'm at now)?

What I'm trying to do
I'm trying to learn TensorFlow object recognition and as usual with new things, I scoured the web for tutorials. I don't want to involve any third party cloud service or web development framework, I want to learn to do it with just native JavaScript, Python, and the TensorFlow library.
What I have so far
So far, I've followed a TensorFlow object detection tutorial (accompanied by a 5+ hour video) to the point where I've trained a model in TensorFlow (Python) and want to convert it to run in a browser via TensorFlowJS. I've also tried other tutorials and haven't found one that explains how to do this without a third-party cloud tool and React.
I know in order to use this model with tensorflow.js my goal is to get files like:
group1-shard1of2.bin
group1-shard2of2.bin
labels.json
model.json
I've gotten to the point where I created my TFRecord files and started training:
py Tensorflow\models\research\object_detection\model_main_tf2.py --model_dir=Tensorflow\workspace\models\my_ssd_mobnet --pipeline_config_path=Tensorflow\workspace\models\my_ssd_mobnet\pipeline.config --num_train_steps=100
It seems after training the model, I'm left with:
files named checkpoint, ckpt-1.data-00000-of-00001, ckpt-1.index, pipeline.config
the pre-trained model ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8 (which I believe isn't the part that changes during training, right?)
I'm sure it's not hard to get from this step to the files I need, but I honestly browsed a lot of documentation, tutorials, and Google results and didn't see an example of doing it without some third-party cloud service. Maybe it's in the documentation and I'm missing something obvious.
The project directory structure looks like this: [screenshot omitted]
Where I've looked for an answer
For some reason, frustratingly, every single tutorial I've found (including the one linked above) for using a pre-trained TensorFlow model for object detection via TensorFlowJS has required the use of IBM Cloud and ReactJS. Maybe they're all copying from some tutorial they found and now all the tutorials include this, I don't know. What I do know is that I'm building an Electron.js desktop app, and object detection shouldn't require network connectivity, assuming the compute happens on the user's device. To clarify: I'm creating an app where the user trains the model, so it's not just a matter of a one-time conversion. I want to be able to train with Python TensorFlow and convert the model to run on JavaScript TensorFlow without any cloud API.
So I stopped looking for tutorials and tried looking directly at the documentation at https://github.com/tensorflow/tfjs.
When you get to the section about importing pre-trained models, it says:
Importing pre-trained models
We support porting pre-trained models from:
TensorFlow SavedModel
Keras
So I followed that link to Tensorflow SavedModel, which brings us to a project called tfjs-converter. That repo says:
This repository has been archived in favor of tensorflow/tfjs.
This repo will remain around for some time to keep history, but all future PRs should be sent to tensorflow/tfjs inside the tfjs-core folder.
All history and contributions have been preserved in the monorepo.
Which sounds a bit like a circular reference to me, considering it's directing me to the page that just told me to come here. So at this point you're wondering: is this whole library deprecated, or will it work, or what? I looked around in this repo anyway, into: https://github.com/tensorflow/tfjs-converter/tree/master/tfjs-converter
It says:
A 2-step process to import your model:
A python pip package to convert a TensorFlow SavedModel or TensorFlow Hub module to a web friendly format. If you already have a converted model, or are using an already hosted model (e.g. MobileNet), skip this step.
JavaScript API, for loading and running inference.
And it basically says to create a venv and do:
pip install tensorflowjs
tensorflowjs_converter \
--input_format=tf_saved_model \
--output_format=tfjs_graph_model \
--signature_name=serving_default \
--saved_model_tags=serve \
/mobilenet/saved_model \
/mobilenet/web_model
But wait, are the checkpoint files I have a "TensorFlow SavedModel"? This doesn't seem clear, and the documentation doesn't explain. So I googled it, found the documentation, and it says:
You can save and load a model in the SavedModel format using the following APIs:
Low-level tf.saved_model API. This document describes how to use this API in detail. Save: tf.saved_model.save(model, path_to_dir)
The linked API reference elaborates somewhat:
tf.saved_model.save(
    obj, export_dir, signatures=None, options=None
)
with an example:
class Adder(tf.Module):
    @tf.function(input_signature=[tf.TensorSpec(shape=[], dtype=tf.float32)])
    def add(self, x):
        return x + x

model = Adder()
tf.saved_model.save(model, '/tmp/adder')
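A quick way to see what this produces is to load the directory back; a minimal sketch, continuing the docs example above:

import tensorflow as tf

# Reload the SavedModel written above and call the exported function.
loaded = tf.saved_model.load('/tmp/adder')
print(loaded.add(tf.constant(2.0)))  # tf.Tensor(4.0, shape=(), dtype=float32)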
But so far, this isn't familiar at all. I don't understand how to take the results of my training process so far (the checkpoints) and load them into a variable model so I can pass it to this function.
This passage seems important:
Variables must be tracked by assigning them to an attribute of a tracked object or to an attribute of obj directly. TensorFlow objects (e.g. layers from tf.keras.layers, optimizers from tf.train) track their variables automatically. This is the same tracking scheme that tf.train.Checkpoint uses, and an exported Checkpoint object may be restored as a training checkpoint by pointing tf.train.Checkpoint.restore to the SavedModel's "variables/" subdirectory.
And it might be the answer, but I'm not really clear on what it means as far as being "restored", or where to go from there, if that's even the right step to take. All of this is very confusing to someone learning TF, which is why I looked for a tutorial that does it, but again, I can't seem to find one without third-party cloud services / React.
Please help me connect the dots.
You can convert your model to TensorFlowJS format without any cloud services. I have laid out the steps below.
I'm sure it's not hard to get from this step to the files I need.
The checkpoints you see are in tf.train.Checkpoint format (relevant source code that creates these checkpoints in the object detection model code). This is different from the SavedModel and Keras formats.
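If you want to confirm this, you can list what the checkpoint contains; a minimal sketch, where the path prefix is an assumption based on the --model_dir in your training command:

import tensorflow as tf

# List the variables stored in the training checkpoint.
# The prefix is an assumption based on --model_dir from your training command.
ckpt_prefix = 'Tensorflow/workspace/models/my_ssd_mobnet/ckpt-1'
for name, shape in tf.train.list_variables(ckpt_prefix):
    print(name, shape)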
We will go through these steps:
Checkpoint (current) --> SavedModel --> TensorFlowJS
Converting from tf.train.Checkpoint to SavedModel
Please see the script models/research/object_detection/exporter_main_v2.py to convert the Checkpoint files to a SavedModel. (Since you trained with model_main_tf2.py, you need the TF2 export script; the older export_inference_graph.py only handles TF1 checkpoints.)
The invocation below is adapted from the docs of that script. Please adjust the paths to your specific project. --input_type should remain as image_tensor.
python exporter_main_v2.py \
--input_type image_tensor \
--pipeline_config_path path/to/pipeline.config \
--trained_checkpoint_dir path/to/checkpoint_directory \
--output_directory path/to/exported_model_directory
In the output directory, you should see a saved_model directory. We will use this in the next step.
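As a quick sanity check before converting, you can load the exported SavedModel and list its signatures; a minimal sketch with a placeholder path:

import tensorflow as tf

# Confirm the export worked and that 'serving_default' is present,
# since the tfjs converter's --signature_name expects it.
loaded = tf.saved_model.load('path/to/exported_model_directory/saved_model')
print(list(loaded.signatures.keys()))  # expect ['serving_default']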
Converting SavedModel to TensorFlowJS
Follow the instructions at https://github.com/tensorflow/tfjs/tree/master/tfjs-converter, specifically paying attention to the "TensorFlow SavedModel example". The example conversion code is copied below. Please modify the input and output paths for your project. The --signature_name and --saved_model_tags might have to be changed, but hopefully not.
tensorflowjs_converter \
--input_format=tf_saved_model \
--output_format=tfjs_graph_model \
--signature_name=serving_default \
--saved_model_tags=serve \
/mobilenet/saved_model \
/mobilenet/web_model
Using the TensorFlowJS model
I know in order to use this model with tensorflow.js my goal is to get files like:
group1-shard1of2.bin
group1-shard2of2.bin
labels.json
model.json
The steps above should create these files for you, though I don't think labels.json will be created. I am not sure what that file should contain. TensorFlowJS will use model.json to construct the inference graph, and it will load the weights from the .bin files.
Because we converted a TensorFlow SavedModel to a TensorFlowJS model, we will need to load the JS model with tf.loadGraphModel(). See the tfjs converter page for more information.
Note that for TensorFlowJS, there is a difference between a TensorFlow SavedModel and a Keras SavedModel. Here, we are dealing with a TensorFlow SavedModel.
The JavaScript code to run the model is probably out of scope for this answer, but I would recommend reading this TensorFlowJS tutorial. I have included a representative JavaScript portion below.
import * as tf from '@tensorflow/tfjs';
import {loadGraphModel} from '@tensorflow/tfjs-converter';

const MODEL_URL = 'model_directory/model.json';
const model = await loadGraphModel(MODEL_URL);
const cat = document.getElementById('cat');
model.execute(tf.browser.fromPixels(cat));
Extra notes
... Which sounds a bit like a circular reference to me,
The TensorFlowJS ecosystem has been consolidated in the tensorflow/tfjs GitHub repository. The tfjs-converter documentation lives there now. You can create a pull request to https://github.com/tensorflow/tfjs to fix the SavedModel link to point to the tensorflow/tfjs repository.

Workflow for converting .pb files to .tflite

Overview
I know this subject has been discussed many times, but I am having a hard time understanding the workflow, or rather, the variations of the workflow.
For example, imagine you are installing TensorFlow on Windows 10, with the main goal of training a custom model, converting it to TensorFlow Lite, and copying the converted .tflite file to a Raspberry Pi running TensorFlow Lite.
The confusion for me starts with the conversion process. After following along with multiple guides, it seems TensorFlow is often installed with pip or Anaconda. But then I see detailed tutorials which indicate it needs to be built from source in order to convert TensorFlow models to TFLite models.
To make things more interesting, I've also seen models which are converted via Python scripts as seen here.
Question
So far I have seen 3 ways to do this conversion, and it could just be that I don't have a grasp on the full picture. Below are the abbreviated methods I have seen:
Build from source, and use the TensorFlow Lite Optimizing Converter (TOCO):
bazel run --config=opt tensorflow/lite/toco:toco -- --input_file=$OUTPUT_DIR/tflite_graph.pb --output_file=$OUTPUT_DIR/detect.tflite ...
Use the TensorFlow Lite Converter Python API:
converter = tf.lite.TFLiteConverter.from_saved_model(export_dir)
tflite_model = converter.convert()
with tf.io.gfile.GFile('model.tflite', 'wb') as f:
    f.write(tflite_model)
Use the tflite_convert CLI utilities:
tflite_convert --saved_model_dir=/tmp/mobilenet_saved_model --output_file=/tmp/mobilenet.tflite
I *think* I understand that options 2 and 3 are the same, in the sense that the tflite_convert utility is installed with TensorFlow and can be invoked either from the command line or through a Python script. But is there a specific reason you should choose one over the other?
And lastly, what really confuses me is option 1. Maybe it's a version thing (1.x vs 2.x)? But what's the difference between the TensorFlow Lite Optimizing Converter (TOCO) and the TensorFlow Lite Converter? It appears that in order to use TOCO you have to build TensorFlow from source, so is there a reason you would use one over the other?
There is no difference in the output from the different conversion methods, as long as the parameters remain the same. The Python API is better if you want to generate TFLite models in an automated way (e.g. a Python script that runs periodically).
The TensorFlow Lite Optimizing Converter (TOCO) was the first version of the TF-to-TFLite converter. It was recently deprecated and replaced with a new converter that can handle more ops/models. So I wouldn't recommend using toco:toco via bazel, but rather tflite_convert as mentioned here.
You should never have to build the converter from source, unless you are making some changes to it and want to test them out.
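To close the loop on the stated goal (running the converted .tflite file on the Raspberry Pi), here is a minimal inference sketch using the TFLite interpreter; 'model.tflite' and the zero-filled input are placeholders:

import numpy as np
import tensorflow as tf  # on the Pi, tflite_runtime's Interpreter works the same way

# Load the converted model and allocate its tensors.
interpreter = tf.lite.Interpreter(model_path='model.tflite')
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed a dummy input with the expected shape and dtype.
dummy = np.zeros(input_details[0]['shape'], dtype=input_details[0]['dtype'])
interpreter.set_tensor(input_details[0]['index'], dummy)
interpreter.invoke()
print(interpreter.get_tensor(output_details[0]['index']))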

Using retrained data with classify_image.py

I've been using TensorFlow image recognition. I've built many scripts which interact with classify_image.py.
I also retrained the model using retrain.py with my own dataset.
How can I use the two generated files, output_graph.pb and output_labels.txt, with classify_image.py?
Ah, the docs say
If you'd like to use the retrained model in a Python program, this example from @eldor4do shows what you'll need to do.
I just copied/edited that one file locally and ran python .\edited-retraining-example.py. And that was easy.
Note that if you're on Windows, change all examples of /tmp/... to c:/tmp/....
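For reference, the core of that example boils down to loading the retrained graph and labels yourself. A minimal TF1-style sketch, assuming TF 1.x and retrain.py's default tensor names ('final_result:0' and 'DecodeJpeg/contents:0'); the image path is a placeholder:

import tensorflow as tf

# Load the retrained graph produced by retrain.py.
with tf.gfile.GFile('/tmp/output_graph.pb', 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
tf.import_graph_def(graph_def, name='')

# Load the labels, one per line.
labels = [line.strip() for line in tf.gfile.GFile('/tmp/output_labels.txt')]

with tf.Session() as sess:
    image_data = tf.gfile.GFile('some_image.jpg', 'rb').read()  # placeholder image
    predictions = sess.run('final_result:0',
                           {'DecodeJpeg/contents:0': image_data})
    for i in predictions[0].argsort()[::-1][:5]:
        print(labels[i], predictions[0][i])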