Tensorflow 2.2.0 and Keras save model / load model problems - tensorflow

After adding a custom loss function as @tf.function to my Keras DQN, Keras models stopped loading (the model seems to save, but cannot be reloaded). The documentation suggests this is really simple, but...
Various SO answers suggest that models trained using one Keras version cannot be loaded into another Keras version. So I uninstalled Keras 2.4.3 (from the Anaconda env) to avoid any confusion, and am now trying to model and save/load solely using Tensorflow-keras.
So, now trying to save a Tensorflow-keras model and then load that model again, but it will not reload, giving various errors (below). The environment is Anaconda3 Python 3.8 (with Keras 2.4.3, since uninstalled) and Tensorflow 2.2.0 (containing Keras 2.3.0-tf).
Is there some solution to simply save a model and then reload a model in tf 2.2.0 (with keras 2.3.0-tf)?
import tensorflow as tf
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.layers import Dense, LSTM, Masking, Input
from tensorflow.keras.optimizers import Adam
Then all tf.keras modelling and save/load should be done by Keras 2.3.0-tf from within Tensorflow. The model save is done with:
agent.model.save(os.path.join(pathOUT, PAIR, 'models' + modelNum, modelFolder),
                 save_format='tf')
But this generates a deprecation warning during save:
2020-11-26 00:19:03.388858: W tensorflow/python/util/util.cc:329] Sets are not currently
considered sequences, but this may change in the future, so consider avoiding using them.
WARNING:tensorflow:From C:\..mypath......\lib\site-
packages\tensorflow\python\ops\resource_variable_ops.py:1813: calling
BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with
constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
Saved an Intermediate model...
Then I attempt to load the model with:
model = load_model(LOAD_MODEL)
but this generates an error during loading:
TypeError: __init__() got an unexpected keyword argument 'reduction'
Again, is there some solution to simply save a model and then reload a model in tf 2.2.0 (with keras 2.3.0-tf)?
Full error:
Traceback (most recent call last):
File mypath, line 851, in <module>
agent = DQNAgent()
File mypath, line 266, in __init__
self.model = self.create_model()
File mypath, line 336, in create_model
model = load_model(LOAD_MODEL)
File mypath\lib\site-packages\tensorflow\python\keras\saving\save.py", line 190, in load_model
return saved_model_load.load(filepath, compile)
File mypath\lib\site-packages\tensorflow\python\keras\saving\saved_model\load.py", line 116,
in load
model = tf_load.load_internal(path, loader_cls=KerasObjectLoader)
File mypath\lib\site-packages\tensorflow\python\saved_model\load.py", line 602, in
load_internal
loader = loader_cls(object_graph_proto,
File mypath\lib\site-packages\tensorflow\python\keras\saving\saved_model\load.py", line 188,
in __init__
super(KerasObjectLoader, self).__init__(*args, **kwargs)
File mypath\lib\site-packages\tensorflow\python\saved_model\load.py", line 123, in __init__
self._load_all()
File mypath\lib\site-packages\tensorflow\python\keras\saving\saved_model\load.py", line 209,
in _load_all
self._layer_nodes = self._load_layers()
File mypath\lib\site-packages\tensorflow\python\keras\saving\saved_model\load.py", line 312,
in _load_layers
layers[node_id] = self._load_layer(proto.user_object, node_id)
File mypath\lib\site-packages\tensorflow\python\keras\saving\saved_model\load.py", line 335,
in _load_layer
obj, setter = self._revive_from_config(proto.identifier, metadata, node_id)
File mypath\lib\site-packages\tensorflow\python\keras\saving\saved_model\load.py", line 349,
in _revive_from_config
obj = self._revive_metric_from_config(metadata, node_id)
File mypath\lib\site-packages\tensorflow\python\keras\saving\saved_model\load.py", line 441,
in _revive_metric_from_config
obj = metrics.deserialize(
File mypath\lib\site-packages\tensorflow\python\keras\metrics.py", line 3345, in deserialize
return deserialize_keras_object(
File mypath\lib\site-packages\tensorflow\python\keras\utils\generic_utils.py", line 361, in
deserialize_keras_object
(cls, cls_config) = class_and_config_for_serialized_keras_object(
File mypath\lib\site-packages\tensorflow\python\keras\utils\generic_utils.py", line 327, in
class_and_config_for_serialized_keras_object
deserialized_objects[key] = deserialize_keras_object(
File mypath\lib\site-packages\tensorflow\python\keras\utils\generic_utils.py", line 375, in
deserialize_keras_object
return cls.from_config(cls_config)
File mypath\lib\site-packages\tensorflow\python\keras\metrics.py", line 628, in from_config
return super(MeanMetricWrapper, cls).from_config(config)
File mypath\lib\site-packages\tensorflow\python\keras\engine\base_layer.py", line 655, in
from_config
return cls(**config)
TypeError: __init__() got an unexpected keyword argument 'reduction'
Additional code:
Custom loss function (trying to implement gradient ascent by 'flipping' the error gradient); I have tried placing this in various locations (within the same agent class as the model, outside the agent class, etc.):
@tf.function
def positive_mse(y_true, y_pred):
    return -1 * tf.keras.losses.MSE(y_true, y_pred)
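As an aside, since positive_mse is a custom function, Keras cannot revive it by name on reload; passing it back in via the custom_objects argument of load_model is generally required. A minimal sketch, reusing the names from the question:
model = load_model(LOAD_MODEL, custom_objects={'positive_mse': positive_mse})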
The Sequential model I am using is here

Have resolved/bypassed the original keyword argument 'reduction' error by REMOVING the MeanSquaredError() metric from the model compile of the original model. Original model:
model.compile(loss=positive_mse,
              optimizer=Adam(lr=LEARNING_RATE, decay=DECAY),
              metrics=[tf.keras.losses.MeanSquaredError()])
From the Keras docs: "Note that this is an important difference between loss functions like tf.keras.losses.mean_squared_error and default loss class instances like tf.keras.losses.MeanSquaredError: the function version does not perform reduction, but by default the class instance does."
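That difference is easy to see in isolation (a small sketch, independent of the model above):
import tensorflow as tf

y_true = tf.constant([[0.0, 1.0], [1.0, 1.0]])
y_pred = tf.constant([[1.0, 1.0], [1.0, 0.0]])
# Function version: no reduction, one value per sample.
print(tf.keras.losses.mean_squared_error(y_true, y_pred))  # [0.5 0.5]
# Class version: reduces to a scalar by default, and carries a 'reduction' key in its config.
print(tf.keras.losses.MeanSquaredError()(y_true, y_pred))  # 0.5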
The MeanSquaredError loss class passes a 'reduction' keyword during evaluation of the loss over a minibatch, and that keyword ends up in the serialized metric config, which the metric deserializer rejects at load time (the TypeError above). Removing this metric allows the model to be reloaded without error.
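If the MSE metric is still wanted, a likely alternative (untested here) is to take the metric class from tf.keras.metrics rather than tf.keras.losses, since the metric's config does not include a 'reduction' argument:
model.compile(loss=positive_mse,
              optimizer=Adam(lr=LEARNING_RATE, decay=DECAY),
              metrics=[tf.keras.metrics.MeanSquaredError()])  # or simply metrics=['mse']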

Related

How can I run mobiledet model successfully with the pretrained model in TF1 model zoo from TensorFlow object detection api?

I want to test the mobiledet model provided in the TF1 model zoo from the TensorFlow object detection api (tf1 object detection model zoo). The pretrained files contain both a pb file and ckpt files.
So, I have tried two methods to load the pretrained model to do inference.
First, I tried to load the tflite_graph.pb directly. I encountered the following problem; I tried changing the tf version, but that did not solve it.
The code is like this:
import os
import tensorflow as tf

MODEL_DIR = '/tf_ckpts/ssdlite_mobiledet_cpu_320x320_coco_2020_05_19/'
MODEL_CHECK_FILE = os.path.join(MODEL_DIR, 'tflite_graph.pb')
graph = tf.Graph()
with graph.as_default():
    graph_def = tf.GraphDef()
    with tf.gfile.Open(MODEL_CHECK_FILE, 'rb') as f:
        graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name='')
Traceback (most recent call last):
File "/home/zhaoxin/workspace/models-1.12.0/research/inference_demo.py", line 41, in <module>
tf.import_graph_def(graph_def, name='')
File "/home/zhaoxin/tools/miniconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/home/zhaoxin/tools/miniconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/framework/importer.py", line 405, in import_graph_def
producer_op_list=producer_op_list)
File "/home/zhaoxin/tools/miniconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/framework/importer.py", line 505, in _import_graph_def_internal
raise ValueError(str(e))
ValueError: NodeDef mentions attr 'exponential_avg_factor' not in Op<name=FusedBatchNormV3; signature=x:T, scale:U, offset:U, mean:U, variance:U -> y:T, batch_mean:U, batch_variance:U, reserve_space_1:U, reserve_space_2:U, reserve_space_3:U; attr=T:type,allowed=[DT_HALF, DT_BFLOAT16, DT_FLOAT]; attr=U:type,allowed=[DT_FLOAT]; attr=epsilon:float,default=0.0001; attr=data_format:string,default="NHWC",allowed=["NHWC", "NCHW"]; attr=is_training:bool,default=true>; NodeDef: {{node FeatureExtractor/MobileDetCPU/Conv/BatchNorm/FusedBatchNormV3}}. (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).
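If upgrading TF is not an option, one workaround sometimes used for this kind of NodeDef/Op mismatch is to strip the unknown attribute from the GraphDef before importing it. A sketch only, not verified against this particular graph (exponential_avg_factor defaults to 1.0, so deleting it should be harmless when it was left at the default):
graph_def = tf.GraphDef()
with tf.gfile.Open(MODEL_CHECK_FILE, 'rb') as f:
    graph_def.ParseFromString(f.read())
# Drop the attr that the older runtime's FusedBatchNormV3 op does not declare.
for node in graph_def.node:
    if node.op == 'FusedBatchNormV3' and 'exponential_avg_factor' in node.attr:
        del node.attr['exponential_avg_factor']
tf.import_graph_def(graph_def, name='')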
Then, I tried to load the ckpt files to run the model.
mobiledet = 'tf_ckpts/ssdlite_mobiledet_cpu_320x320_coco_2020_05_19/'
meta_path = mobiledet + 'model.ckpt-400000.meta'
ckpt_path = mobiledet + 'model.ckpt-400000'
with tf.Session() as sess:
    saver = tf.train.import_meta_graph(meta_path)
    saver.restore(sess, ckpt_path)
    graph = tf.get_default_graph()
The error is like this:
Traceback (most recent call last):
File "/home/zhaoxin/workspace/models-1.12.0/research/tf_load.py", line 15, in <module>
saver=tf.train.import_meta_graph(meta_path)
File "/home/zhaoxin/tools/miniconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/training/saver.py", line 1453, in import_meta_graph
**kwargs)[0]
File "/home/zhaoxin/tools/miniconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/training/saver.py", line 1477, in _import_meta_graph_with_return_elements
**kwargs))
File "/home/zhaoxin/tools/miniconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/framework/meta_graph.py", line 809, in import_scoped_meta_graph_with_return_elements
return_elements=return_elements)
File "/home/zhaoxin/tools/miniconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/home/zhaoxin/tools/miniconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/framework/importer.py", line 405, in import_graph_def
producer_op_list=producer_op_list)
File "/home/zhaoxin/tools/miniconda3/envs/tf115/lib/python3.6/site-packages/tensorflow_core/python/framework/importer.py", line 501, in _import_graph_def_internal
graph._c_graph, serialized, options) # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.NotFoundError: Op type not registered 'LegacyParallelInterleaveDatasetV2' in binary running on localhost.localdomain. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
It seems that the loading errors of the above two methods are caused by an inconsistency in the tf version, but I have tried many tf versions and failed to solve it. Has anyone successfully run the mobiledet model from the TF1 object detection model zoo?
OS: linux
TF version: tf 1.15
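One more avenue, suggested by the note in the second traceback itself: contrib ops are registered lazily, so importing tf.contrib (or the specific contrib module the graph used) before import_meta_graph may register the missing op. A sketch, untested for 'LegacyParallelInterleaveDatasetV2' specifically:
import tensorflow as tf
import tensorflow.contrib  # force lazy contrib op registration before importing the graph

with tf.Session() as sess:
    saver = tf.train.import_meta_graph(meta_path)
    saver.restore(sess, ckpt_path)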
@Shane Zhao - are you planning on training with a custom dataset, or are you using the pretrained graph as is? The version of Tensorflow should only matter during training, to the best of my knowledge. Anyway, please refer to this demo from Google in Colab - https://colab.research.google.com/github/luxonis/depthai-ml-training/blob/master/colab-notebooks/Easy_Object_Detection_Demo_Training.ipynb#scrollTo=JDddx2rPfex9

How to save a TensorFlow Hub model in SavedModels format?

I'd like to load a model from TensorFlow Hub and save it to disk. I tried:
import tensorflow as tf
import tensorflow_hub as hub
def save_module(url, save_path):
    with tf.Graph().as_default():
        module = hub.load(url)
        tf.saved_model.save(module, save_path)
save_module("https://tfhub.dev/google/universal-sentence-encoder/4", "./saved-module")
But this fails with:
Traceback (most recent call last):
File "C:\project\python-env\lib\site-packages\tensorflow\python\client\session.py", line 1365, in _do_call
return fn(*args)
File "C:\project\python-env\lib\site-packages\tensorflow\python\client\session.py", line 1349, in _run_fn
return self._call_tf_sessionrun(options, feed_dict, fetch_list,
File "C:\project\python-env\lib\site-packages\tensorflow\python\client\session.py", line 1441, in _call_tf_sessionrun
return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
tensorflow.python.framework.errors_impl.FailedPreconditionError: 2 root error(s) found.
(0) Failed precondition: Error while reading resource variable EncoderDNN/DNN/ResidualHidden_2/dense/kernel/part_27 from Container: localhost. This could mean that the variable was uninitialized. Not found: Container localhost does not exist. (Could not find resource: localhost/EncoderDNN/DNN/ResidualHidden_2/dense/kernel/part_27)
[[{{node EncoderDNN/DNN/ResidualHidden_2/dense/kernel/part_27/Read/ReadVariableOp}}]]
[[EncoderDNN/DNN/ResidualHidden_3/dense/kernel/part_22/Read/ReadVariableOp/_287]]
(1) Failed precondition: Error while reading resource variable EncoderDNN/DNN/ResidualHidden_2/dense/kernel/part_27 from Container: localhost. This could mean that the variable was uninitialized. Not found: Container localhost does not exist. (Could not find resource: localhost/EncoderDNN/DNN/ResidualHidden_2/dense/kernel/part_27)
[[{{node EncoderDNN/DNN/ResidualHidden_2/dense/kernel/part_27/Read/ReadVariableOp}}]]
0 successful operations.
0 derived errors ignored.
The answer must use the TensorFlow 2 API. Ideally, I want to accomplish this without Keras but I'll also accept answers that use it. Any ideas?
I couldn't get this working without Keras, but in any case this works:
import tensorflow as tf
import tensorflow_hub as hub
def save_module(url, save_path):
    module = hub.KerasLayer(url)
    model = tf.keras.Sequential(module)
    tf.saved_model.save(model, save_path)
save_module("https://tfhub.dev/google/universal-sentence-encoder/4", "./saved-module")

How to convert the body-pix models for tfjs to keras h5 or tensorflow frozen graph

I'm porting body-pix to Python and C++ and want to export the body-pix pretrained model for tensorflow.js into a tensorflow frozen graph. Is it possible?
I've already downloaded the following files and tried to convert them using tensorflowjs_converter, but it didn't work.
https://storage.googleapis.com/tfjs-models/savedmodel/posenet_mobilenet_025_partmap/model.json
https://storage.googleapis.com/tfjs-models/savedmodel/posenet_mobilenet_025_partmap/group1-shard1of1
The result is here.
$ tensorflowjs_converter --input_format tfjs_layers_model --output_format keras posenet_mobilenet_025_partmap/model.json test.h5
Traceback (most recent call last):
File "/home/xxx/anaconda3/envs/tfjs_test2/bin/tensorflowjs_converter", line 10, in <module>
sys.exit(main())
File "/home/xxx/anaconda3/envs/tfjs_test2/lib/python3.6/site-packages/tensorflowjs/converters/converter.py", line 368, in main
FLAGS.output_path)
File "/home/xxx/anaconda3/envs/tfjs_test2/lib/python3.6/site-packages/tensorflowjs/converters/converter.py", line 169, in dispatch_tensorflowjs_to_keras_h5_conversion
model = keras_tfjs_loader.load_keras_model(config_json_path)
File "/home/xxx/anaconda3/envs/tfjs_test2/lib/python3.6/site-packages/tensorflowjs/converters/keras_tfjs_loader.py", line 218, in load_keras_model
use_unique_name_scope=use_unique_name_scope)
File "/home/xxx/anaconda3/envs/tfjs_test2/lib/python3.6/site-packages/tensorflowjs/converters/keras_tfjs_loader.py", line 65, in _deserialize_keras_model
model = keras.models.model_from_json(json.dumps(model_topology_json))
File "/home/xxx/anaconda3/envs/tfjs_test2/lib/python3.6/site-packages/tensorflow/python/keras/saving/model_config.py", line 96, in model_from_json
return deserialize(config, custom_objects=custom_objects)
File "/home/xxx/anaconda3/envs/tfjs_test2/lib/python3.6/site-packages/tensorflow/python/keras/layers/serialization.py", line 81, in deserialize
layer_class_name = config['class_name']
KeyError: 'class_name'
The converter version is here.
tensorflowjs 1.0.1
Dependency versions:
keras 2.2.4-tf
tensorflow 2.0.0-dev20190405
On ubuntu 16.04 LTS and anaconda 3.
I've tried tensorflowjs 0.8.5, but it also didn't work.
It would be helpful if you could tell me how to convert them. Either the keras format or a tensorflow frozen graph is OK. I think the two can be converted to each other.
1. Download the model.json file, e.g. https://storage.googleapis.com/tfjs-models/savedmodel/bodypix/resnet50/float/model-stride16.json
2. Download the corresponding weights listed in manifest.json: https://storage.googleapis.com/tfjs-models/savedmodel/bodypix/resnet50/float/manifest.json
3. Install tfjs_graph_converter from https://github.com/ajaichemmanam/tfjs-to-tf
4. Convert the model to a .pb file:
tfjs_graph_converter path/to/js/model path/to/frozen/model.pb
Here is an example of POSENET converted to a keras h5 model: https://github.com/tensorflow/tfjs/files/3943875/posenet.zip
In the same way, you can use the bodypix models and convert them.
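Once the frozen .pb exists, it can be loaded with standard graph-mode code (a minimal sketch; the path is whatever was passed to the converter):
import tensorflow as tf

with tf.io.gfile.GFile('path/to/frozen/model.pb', 'rb') as f:
    graph_def = tf.compat.v1.GraphDef()
    graph_def.ParseFromString(f.read())
with tf.Graph().as_default() as graph:
    tf.compat.v1.import_graph_def(graph_def, name='')
print([op.name for op in graph.get_operations()][:5])  # first few node names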

Trying to restore model, but tf.train.import_meta_graph(meta_path) raises error

I downloaded pretrained mobilenetV2 models from tensorflow models and tried to restore the graph, but got an unexpected error.
The code to reproduce the error is pretty concise:
import tensorflow as tf
meta_path = 'path/to/mobilenet_v2_0.35_224/mobilenet_v2_0.35_224.ckpt.meta'
sess = tf.Session(config=tf.ConfigProto(allow_soft_placement=True))
saver = tf.train.import_meta_graph(meta_path)
then the last line raises error:
Traceback (most recent call last):
File "/home/CVAR/study/codes/languages/python/pycharm/learn_tensorflow/train_mobileNet_v2/test_of_functions/saver_test.py", line 21, in <module>
saver = tf.train.import_meta_graph(meta_path)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1960, in import_meta_graph
**kwargs)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/meta_graph.py", line 744, in import_scoped_meta_graph
producer_op_list=producer_op_list)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 432, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/importer.py", line 391, in import_graph_def
_RemoveDefaultAttrs(op_dict, producer_op_list, graph_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/importer.py", line 158, in _RemoveDefaultAttrs
op_def = op_dict[node.op]
KeyError: 'InfeedEnqueueTuple'
My system information is :
ubuntu 16.04
python 3.5
tensorflow-gpu 1.9
Any idea?
I recently also met such a problem. It seems the reason is that the TensorFlow version you used to train the model is different from the version you use to read the graph description proto. What you need to do is reinstall the TensorFlow version you trained with. Otherwise, retraining the model would work.
FYI, the TensorFlow version I used to train was 1.12.0; by contrast, the version I used to load the graph was 1.13.1. Reinstallation solved the problem.
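One way to confirm such a mismatch before reinstalling anything is to read the TF version recorded in the .meta file itself, since the MetaGraphDef proto stores it (a small sketch):
from tensorflow.core.protobuf import meta_graph_pb2

meta_graph = meta_graph_pb2.MetaGraphDef()
with open(meta_path, 'rb') as f:
    meta_graph.ParseFromString(f.read())
print(meta_graph.meta_info_def.tensorflow_version)  # the version that wrote the checkpoint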
There are some ops not defined here; from conv_blocks import * will fix this bug, but I then got another problem: "ValueError: NodeDef expected inputs 'float, int32' do not match 1 inputs specified". Still debugging, but I hope this tip solves your problem.

TF object detection API - Compute evaluation measures failed

I successfully trained a model on my own dataset, exported the inference graph and did the inference on my test dataset.
I now have:
the detections as a tfrecord file, specified in the input config
an eval_config file with the specified metrics set
When I try to compute the measures as in the new object detector inference and evaluation measure computation tutorial with:
python object_detection/metrics/offline_eval_map_corloc.py --eval_dir=/media/sf_shared --eval_config_path=/media/sf_shared/eval_config.pbtxt --input_config_path=/media/sf_shared/input_config.pbtxt
It returns this AttributeError:
INFO:tensorflow:Processing file: /media/sf_shared/detections.record
INFO:tensorflow:Processed 0 images...
Traceback (most recent call last):
File "object_detection/metrics/offline_eval_map_corloc.py", line 173, in <module>
tf.app.run(main)
File "/home/chrza/anaconda2/envs/tf27/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "object_detection/metrics/offline_eval_map_corloc.py", line 166, in main
metrics = read_data_and_evaluate(input_config, eval_config)
File "object_detection/metrics/offline_eval_map_corloc.py", line 124, in read_data_and_evaluate
decoded_dict)
File "/home/chrza/anaconda2/envs/tf27/lib/python2.7/site-packages/tensorflow/models/research/object_detection/utils/object_detection_evaluation.py", line 174, in add_single_ground_truth_image_info
(groundtruth_dict[standard_fields.InputDataFields.groundtruth_difficult]
AttributeError: 'NoneType' object has no attribute 'size'
Any hints?
I fixed it (temporarily) as follows:
if (standard_fields.InputDataFields.groundtruth_difficult in groundtruth_dict.keys()
        and groundtruth_dict[standard_fields.InputDataFields.groundtruth_difficult] is not None):
    if (groundtruth_dict[standard_fields.InputDataFields.groundtruth_difficult].size
            or not groundtruth_classes.size):
        groundtruth_difficult = groundtruth_dict[standard_fields.InputDataFields.groundtruth_difficult]
In place of the existing lines (195-198) in object_detection/utils/object_detection_evaluation.py.
The error comes from the fact that the size of the groundtruth_difficult field is checked even when no difficulty flag was passed at all.
This is an error if you skipped that parameter in your tf records.
Perhaps this was the intent of the developers, but the clarity of the documentation certainly leaves a lot to be desired.
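If patching the library is undesirable, the other way around the problem is to write the difficult flag into the TFRecords in the first place. A hypothetical sketch following the object_detection dataset conventions (feature_dict and num_boxes are illustrative names; the field name matches the API's create_*_tf_record examples):
from object_detection.utils import dataset_util

difficult = [0] * num_boxes  # one flag per groundtruth box; 0 = not difficult
feature_dict['image/object/difficult'] = dataset_util.int64_list_feature(difficult)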