_DecodeError('Unexpected end-group tag.') when running import_pb_to_tensorboard.py - tensorflow

I am trying out ways to deploy a TensorFlow model on Android/iOS devices. So I did:
1) use tf.saved_model.builder.SavedModelBuilder to get the model into a .pb file
2) use tf.saved_model.loader.load() to verify that I can restore the model (a rough sketch of both steps is below)
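For context, the save/load code looks roughly like this (a minimal sketch of the TF 1.x SavedModel API; the export directory is a placeholder and the full scripts are linked further down):
import tensorflow as tf

export_dir = './saved_model'  # placeholder path

# 1) save: write a SavedModel (saved_model.pb + variables/);
#    sess is the training session that holds the variables
builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
builder.add_meta_graph_and_variables(sess, [tf.saved_model.tag_constants.SERVING])
builder.save()

# 2) load: restore the SavedModel into a fresh session to verify it
with tf.Session(graph=tf.Graph()) as sess2:
    tf.saved_model.loader.load(sess2, [tf.saved_model.tag_constants.SERVING], export_dir)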
However, when I want to do further inspection of the model using import_pb_to_tensorboard.py, following the suggestions at
1) https://medium.com/#daj/how-to-inspect-a-pre-trained-tensorflow-model-5fd2ee79ced0
2) https://hackernoon.com/running-a-tensorflow-model-on-ios-and-android-ce89446c8143
I got this error:
File "/Users/rjtang/_hack/env.tensorflow_src/lib/python3.4/site-packages/google/protobuf/internal/python_message.py", line 1083, in MergeFromString
if self._InternalParse(serialized, 0, length) != length:
.....
File "/Users/rjtang/_hack/env.tensorflow_src/lib/python3.4/site-packages/google/protobuf/internal/decoder.py", line 612, in DecodeRepeatedField
if value.add()._InternalParse(buffer, pos, new_pos) != new_pos:
....
File "/Users/rjtang/_hack/env.tensorflow_src/lib/python3.4/site-packages/google/protobuf/internal/decoder.py", line 746, in DecodeMap
raise _DecodeError('Unexpected end-group tag.')
The code and the generated .pb files are here:
https://github.com/rjt10/hear_it/blob/master/urban_sound/saved_model.pb
https://github.com/rjt10/hear_it/blob/master/urban_sound/savedmodel_save.py
https://github.com/rjt10/hear_it/blob/master/urban_sound/savedmodel_load.py
The version of TensorFlow I use is built from source ("HEAD detached at v1.4.1").

Well, I understand what's happening now. TensorFlow has at least three ways to save and load a model. The graph will be serialized as one of the following three protobuf objects:
GraphDef
MetaGraphDef
SavedModel
You just need to deserialize it with the proto that matches how it was written, as in https://github.com/rjt10/hear_it/blob/master/urban_sound/model_check.py
For Android, TensorFlowInferenceInterface() expects a GraphDef, https://github.com/tensorflow/tensorflow/blob/e2be6d4c4fc9f1b7f6040b51b23190c14202e797/tensorflow/contrib/android/java/org/tensorflow/contrib/android/TensorFlowInferenceInterface.java#L541
That explains why a SavedModel .pb can't be fed to it directly.
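For anyone hitting the same error: parse the .pb with the proto that matches how it was written. A minimal sketch using the TF 1.x protos (file names are placeholders):
import tensorflow as tf
from tensorflow.core.protobuf import saved_model_pb2

# A .pb written by SavedModelBuilder is a SavedModel proto, not a bare GraphDef
sm = saved_model_pb2.SavedModel()
with open('saved_model.pb', 'rb') as f:
    sm.ParseFromString(f.read())
graph_def = sm.meta_graphs[0].graph_def  # the GraphDef lives inside the MetaGraphDef

# A frozen graph written with as_text=False parses directly as a GraphDef
frozen_graph_def = tf.GraphDef()
with open('frozen_graph.pb', 'rb') as f:
    frozen_graph_def.ParseFromString(f.read())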

Related

How to import a TensorFlow web model in Python

I have a TensorFlow "graph model" consisting of a model.json and several .bin files. In JavaScript I am able to load those files using
const weights = browser.runtime.getURL("web_model/model.json");
tf.loadGraphModel(weights)
However, I would like to use this model in Python in order to process the results better.
When I try to load the model in python with
new_model = keras.models.load_model('./web_model/model.json')
I get the following error:
File "h5py/h5f.pyx", line 106, in h5py.h5f.open
OSError: Unable to open file (file signature not found)
I don't understand: since the JavaScript code is able to run the model, I think Python should be able to do the same. What am I doing wrong?

Template Matching through python API on Linux desktop

I'm following the tutorial on using your own template images to do object 3D pose tracking, but I'm trying to get it working on Ubuntu 20.04 with a live webcam stream.
I was able to successfully make my index .pb file with extracted KNIFT features from my custom images.
It seems the next thing to do is load the provided template matching graph (in mediapipe/graphs/template_matching/template_matching_desktop.pbtxt) (replacing the index_proto_filename of the BoxDetectorCalculator with my own index file), and run it on a video input stream to track my custom object.
I was hoping that would be easiest to do in python, but am running into dependency problems.
(I installed mediapipe python with pip3 install mediapipe)
First, I couldn't find how to directly load a .pbtxt file as a graph in the mediapipe python API, but that's ok. I just load the text it contains and use that.
template_matching_graph_filepath = os.path.expanduser("~/mediapipe/mediapipe/graphs/template_matching/template_matching_desktop.pbtxt")  # expanduser so "~" resolves
graph = mp.CalculatorGraph(graph_config=open(template_matching_graph_filepath).read())
But I get errors about missing calculators:
No registered object with name: OpenCvVideoDecoderCalculator; Unable to find Calculator "OpenCvVideoDecoderCalculator"
or
[libprotobuf ERROR external/com_google_protobuf/src/google/protobuf/text_format.cc:309] Error parsing text-format mediapipe.CalculatorGraphConfig: 54:70: Could not find type "type.googleapis.com/mediapipe.TfLiteInferenceCalculatorOptions" stored in google.protobuf.Any.
It seems similar to this troubleshooting case but, since I'm not trying to compile an application, I'm not sure how to link in the missing calculators.
How do I make the mediapipe python API aware of these graphs?
UPDATE:
I made decent progress by adding the calculators that the template_matching graph depends on to the cc_library deps in the mediapipe/python/BUILD file:
cc_library(
    name = "builtin_calculators",
    deps = [
        "//mediapipe/calculators/image:feature_detector_calculator",
        "//mediapipe/calculators/image:image_properties_calculator",
        "//mediapipe/calculators/video:opencv_video_decoder_calculator",
        "//mediapipe/calculators/video:opencv_video_encoder_calculator",
        "//mediapipe/calculators/video:box_detector_calculator",
        "//mediapipe/calculators/tflite:tflite_inference_calculator",
        "//mediapipe/calculators/tflite:tflite_tensors_to_floats_calculator",
        "//mediapipe/calculators/util:timed_box_list_id_to_label_calculator",
        "//mediapipe/calculators/util:timed_box_list_to_render_data_calculator",
        "//mediapipe/calculators/util:landmarks_to_render_data_calculator",
        "//mediapipe/calculators/util:annotation_overlay_calculator",
        ...
I also modified solution_base.py so it knows about BoxDetector's options.
from mediapipe.calculators.video import box_detector_calculator_pb2
...
CALCULATOR_TO_OPTIONS = {
    'BoxDetectorCalculator':
        box_detector_calculator_pb2.BoxDetectorCalculatorOptions,
Then I rebuilt and installed mediapipe python from source with:
~/mediapipe$ python3 setup.py install --link-opencv
Then I was able to make my own class derived from SolutionBase
from mediapipe.python.solution_base import SolutionBase

class ObjectTracker(SolutionBase):
    """Process a video stream and output a video with edges of templates highlighted."""

    def __init__(self, object_knift_index_file_path):
        super().__init__(
            binary_graph_path=object_pose_estimation_binary_file_path,
            calculator_params={"BoxDetector.index_proto_filename": object_knift_index_file_path},
        )

    def process(self, image: np.ndarray) -> NamedTuple:
        return super().process(input_data={'input_video': image})
ot = ObjectTracker(object_knift_index_file_path="/path/to/my/object_knift_index.pb")
Finally, I process a video frame from a cv2.VideoCapture
cv_video = cv2.VideoCapture(0)
result, frame = cv_video.read()
input_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
res = ot.process(image=input_frame)
So close! But I run into this error, which I just don't know what to do with:
/usr/local/lib/python3.8/dist-packages/mediapipe/python/solution_base.py in process(self, input_data)
326 if data.shape[2] != RGB_CHANNELS:
327 raise ValueError('Input image must contain three channel rgb data.')
--> 328 self._graph.add_packet_to_input_stream(
329 stream=stream_name,
330 packet=self._make_packet(input_stream_type,
RuntimeError: Graph has errors:
Calculator::Open() for node "BoxDetector" failed: ; Error while reading file: /usr/local/lib/python3.8/dist-packages/
Looks like CalculatorNode::OpenNode() is trying to open the python API install path as a file. Maybe it has to do with the default_context. I have no idea where to go from here. :(

ValueError: Input 0 of node Variable/Assign was passed int32 from Variable:0 incompatible with expected int32_ref

I am currently trying to get a trained TF seq2seq model working with TensorFlow.js, and I need to get the JSON files for this. My input is a few sentences and the output is "embeddings". The model works when I read in the checkpoint; however, I can't get it converted for tf.js. Part of the conversion process is to freeze my latest checkpoint as a protobuf (.pb) file and then convert that to the JSON format expected by TensorFlow.js.
The above is my understanding; since I haven't done this before it may be wrong, so please feel free to correct anything I have deduced incorrectly from reading.
When I try to convert to the tensorflow.js format I use the following command:
sudo tensorflowjs_converter --input_format=tf_frozen_model \
    --output_node_names='embeddings' \
    --saved_model_tags=serve \
    ./saved_model/model.pb /web_model
This then displays the error listed in this post:
ValueError: Input 0 of node Variable/Assign was passed int32 from
Variable:0 incompatible with expected int32_ref.
One of the problems is that I'm not even sure how to troubleshoot this, so I was hoping someone might have some guidance or know what my issue may be.
I have uploaded the code I used to convert the checkpoint file to protobuf at the link below. At the bottom of the notebook I added an import of that file, which produces the same error I get when trying to convert to the TensorFlow.js format (just scroll to the bottom of the notebook).
https://github.com/xtr33me/textsumToTfjs/blob/master/convert_ckpt_to_pb.ipynb
Any help would be greatly appreciated!
Still unsure as to why I was getting the above error, but in the end I was able to resolve the issue by switching over to TF's SavedModel via tf.saved_model. A rough example of what worked for me is below, should anyone run into something similar in the future. After saving out the model as below, I was then able to run tensorflowjs_converter on it and export the correct files.
if first_iter == True:  # first time through
    first_iter = False
    # Let's try saving this badboy
    cwd = os.getcwd()
    path = os.path.join(cwd, 'simple')
    shutil.rmtree(path, ignore_errors=True)

    inputs_dict = {
        "batch_decoder_input": tf.convert_to_tensor(batch_decoder_input)
    }
    outputs_dict = {
        "batch_decoder_output": tf.convert_to_tensor(batch_decoder_output)
    }

    tf.saved_model.simple_save(
        sess, path, inputs_dict, outputs_dict
    )
    print('Model Saved')
    # End save model code
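For reference, the converter call afterwards can be pointed at the SavedModel directory ('simple' above) along these lines. This is a sketch: the flag names are from older tensorflowjs_converter releases and may differ in newer versions, and the 'embeddings' output node is specific to this model.
tensorflowjs_converter \
    --input_format=tf_saved_model \
    --output_node_names='embeddings' \
    --saved_model_tags=serve \
    ./simple ./web_model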

Freeze graph error while preparing custom tensorflow mobile model

I am preparing a custom model to run on an Android phone using the instructions from https://www.tensorflow.org/mobile/prepare_models
First I retrained the model on custom images using the command below:
$ python tensorflow/examples/image_retraining/retrain.py --image_dir tensorflow/examples/image_retraining/my_images/ --learning_rate=0.0005 --testing_percentage=15 --validation_percentage=15 --train_batch_size=32 --validation_batch_size=-1 --flip_left_right True --random_scale=30 --random_brightness=30 --eval_step_interval=100 --how_many_training_steps=100 --tfhub_module https://tfhub.dev/google/imagenet/mobilenet_v2_100_224/feature_vector/1
As the next step, I tested the model using label_image.py, which works fine in predicting the input image. However, freeze_graph gives an error:
$ bazel-bin/tensorflow/python/tools/freeze_graph --input_graph=/tmp/output_graph.pb --output_graph=/tmp/frozen_graph.pb
I keep getting this error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position
57: invalid start byte
I noticed that your --input_graph is /tmp/output_graph.pb. Is your graph written as a binary file (as_text=False) rather than as a text pbtxt? If so, you will need to pass the --input_binary=true flag to freeze_graph.
That is, if you write your graph as a binary file using:
tf.train.write_graph(sess.graph_def, 'trainGraph', 'train2.pbtxt', as_text=False)  # binary despite the .pbtxt extension
then you need to pass the --input_binary=true flag to freeze_graph.
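For completeness, a full freeze_graph invocation would then look roughly like this. The checkpoint path is a placeholder, and the output node name final_result is only the default final tensor name used by retrain.py, so adjust both for your graph:
bazel-bin/tensorflow/python/tools/freeze_graph \
    --input_graph=/tmp/output_graph.pb \
    --input_binary=true \
    --input_checkpoint=/tmp/checkpoint/model.ckpt \
    --output_node_names=final_result \
    --output_graph=/tmp/frozen_graph.pb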

Error while deserializing the Apache MXNet object

I have trained and saved a model using Amazon SageMaker, which saves the model as model.tar.gz. When untarred, it contains a file model_algo-1, which is a serialized Apache MXNet object. To load the model into memory I need to deserialize it. I tried doing so as follows:
import mxnet as mx
print(mx.ndarray.load('model_algo-1'))
Reference taken from https://docs.aws.amazon.com/sagemaker/latest/dg/cdf-training.html
However, doing this yields the following error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.4/site-packages/mxnet/ndarray/utils.py", line 175, in load
    ctypes.byref(names)))
  File "/usr/local/lib/python3.4/site-packages/mxnet/base.py", line 146, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [19:06:25] src/ndarray/ndarray.cc:1112: Check failed: header == kMXAPINDArrayListMagic Invalid NDArray file format

Stack trace returned 10 entries:
[bt] (0) /usr/local/lib/python3.4/site-packages/mxnet/libmxnet.so(+0x192112) [0x7fe432bfa112]
[bt] (1) /usr/local/lib/python3.4/site-packages/mxnet/libmxnet.so(+0x192738) [0x7fe432bfa738]
[bt] (2) /usr/local/lib/python3.4/site-packages/mxnet/libmxnet.so(+0x24a5c44) [0x7fe434f0dc44]
[bt] (3) /usr/local/lib/python3.4/site-packages/mxnet/libmxnet.so(MXNDArrayLoad+0x248) [0x7fe434d19ad8]
[bt] (4) /usr/lib64/libffi.so.6(ffi_call_unix64+0x4c) [0x7fe48c5bbcec]
[bt] (5) /usr/lib64/libffi.so.6(ffi_call+0x1f5) [0x7fe48c5bb615]
[bt] (6) /usr/lib64/python3.4/lib-dynload/_ctypes.cpython-34m.so(_ctypes_callproc+0x2fb) [0x7fe48c7ce18b]
[bt] (7) /usr/lib64/python3.4/lib-dynload/_ctypes.cpython-34m.so(+0xa4cf) [0x7fe48c7c84cf]
[bt] (8) /usr/lib64/libpython3.4m.so.1.0(PyObject_Call+0x8c) [0x7fe4942fcb5c]
[bt] (9) /usr/lib64/libpython3.4m.so.1.0(PyEval_EvalFrameEx+0x36c5) [0x7fe4943ac915]
Could someone suggest how this can be resolved?
If your model is serialized into the archive properly, then there should be at least two files:
model_name.json - it contains the architecture of your model
model_name.params - it contains parameters of your model
So, to load the model back, you need to:
Restore the model itself by loading the json file.
Restore the model parameters (you don't load them with mxnet nd.array; they are loaded into the full model).
Here is a code example of how to do it:
import mxnet as mx
from mxnet import gluon

# sym_json - content of the .json file
net = gluon.nn.SymbolBlock(
    outputs=mx.sym.load_json(sym_json),
    inputs=mx.sym.var('data'))

# params_filename - full path to the parameters file
net.load_params(params_filename)
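Once restored, the SymbolBlock can be called like any Gluon network. A quick smoke test might look like this (the input shape here is a hypothetical placeholder; use your model's real input shape):
# hypothetical smoke test - replace (1, 10) with your model's input shape
x = mx.nd.random.uniform(shape=(1, 10))
out = net(x)
print(out)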
If you also want to check the serialization of your model, take a look at this example, which shows how to serialize a trained model manually before uploading to SageMaker.
More details on serializing and deserializing the model manually can be found here.
I trained a stock Linear Learner algorithm through AWS SageMaker. It produces a model object called model.tar.gz in the output folder. As Vasanti noted, an article mentions that these objects are MXNet objects.
I knew I had to unpack the tar; what I didn't realize was how many times. I started with this code:
import subprocess
cmdline = ['tar','-xzvf','model.tar.gz']
subprocess.call(cmdline)
which yields the file called 'model_algo-1', which led me to this page. However, it's still a packed file, so run:
cmdline = ['tar','-xzvf','model_algo-1']
subprocess.call(cmdline)
This yields:
additional-params.json
manifest.json
mx-mod-0000.params
mx-mod-symbol.json
From there, you can utilize Sergei's post:
# load the json file
import json
sym_json = json.load(open('mx-mod-symbol.json'))
sym_json_string = json.dumps(sym_json)

# open the model
import mxnet as mx
from mxnet import gluon
net = gluon.nn.SymbolBlock(
    outputs=mx.sym.load_json(sym_json_string),
    inputs=mx.sym.var('data'))

# load the params file
net.load_parameters('mx-mod-0000.params', allow_missing=True)
Now, if only I knew what to do with this MXNet/Gluon object to get what I really want, which is a feature-importance rank order and weights for some model explainability.