I am trying to create version under google cloud ml models for the successfully trained tensorflow estimator model. I believe that I am providing the correct Uri(in google storage) which includes saved_model.pb.
Framework: Tensorflow,
Framework Version: 1.13.1,
Runtime Version: 1.13,
Python: 3.5
Here is the traceback of the error:
Traceback (most recent call last):
File "/google/google-cloud-sdk/lib/googlecloudsdk/calliope/cli.py", line 985, in Execute
resources = calliope_command.Run(cli=self, args=args)
File "/google/google-cloud-sdk/lib/googlecloudsdk/calliope/backend.py", line 795, in Run
resources = command_instance.Run(args)
File "/google/google-cloud-sdk/lib/surface/ml_engine/versions/create.py", line 119, in Run
python_version=args.python_version)
File "/google/google-cloud-sdk/lib/googlecloudsdk/command_lib/ml_engine/versions_util.py", line 114, in Create
message='Creating version (this might take a few minutes)...')
File "/google/google-cloud-sdk/lib/googlecloudsdk/command_lib/ml_engine/versions_util.py", line 75, in WaitForOpMaybe
return operations_client.WaitForOperation(op, message=message).response
File "/google/google-cloud-sdk/lib/googlecloudsdk/api_lib/ml_engine/operations.py", line 114, in WaitForOperation
sleep_ms=5000)
File "/google/google-cloud-sdk/lib/googlecloudsdk/api_lib/util/waiter.py", line 264, in WaitFor
sleep_ms, _StatusUpdate)
File "/google/google-cloud-sdk/lib/googlecloudsdk/api_lib/util/waiter.py", line 326, in PollUntilDone
sleep_ms=sleep_ms)
File "/google/google-cloud-sdk/lib/googlecloudsdk/core/util/retry.py", line 229, in RetryOnResult
if not should_retry(result, state):
File "/google/google-cloud-sdk/lib/googlecloudsdk/api_lib/util/waiter.py", line 320, in _IsNotDone
return not poller.IsDone(operation)
File "/google/google-cloud-sdk/lib/googlecloudsdk/api_lib/util/waiter.py", line 122, in IsDone
raise OperationError(operation.error.message)
OperationError: Bad model detected with error: "Failed to load model: a bytes-like object is required, not 'str' (Error code: 0)"
ERROR: (gcloud.ml-engine.versions.create) Bad model detected with error: "Failed to load model: a bytes-like object is required, not 'str' (Error code: 0)"
Do you have any idea what might be the problem?
EDIT
I am using:
tf.estimator.LatestExporter('exporter', model.serving_input_fn)
as a estimator exporter.
serving_input_fn:
def serving_input_fn():
inputs = {'string1': tf.placeholder(tf.int16, [None, MAX_SEQUENCE_LENGTH]),
'string2': tf.placeholder(tf.int16, [None, MAX_SEQUENCE_LENGTH])}
return tf.estimator.export.ServingInputReceiver(inputs, inputs)
PS: my model takes two inputs and returns one binary output.
Related
I'm new to tf object detection api 2.
After training the model you can run an evaluation process to check the accuracy of the model.
But when I tried to run I got the below error. I'm using the backbone as an efficientDet.
I was able to run the evaluation for scaling resolution 512 but 640 is failing with the below error.
This is the python file I called and ended up with the below error.
enter code here /tensorflow/models/research/object_detection/model_main_tf2.py
`enter code here`enter code here`Call arguments received:
• inputs=tf.Tensor(shape=(1, 480, 640, 3), dtype=float32)
• kwargs={'training': 'False'}
exception.
INFO:tensorflow:A replica probably exhausted all examples. Skipping pending examples on other replicas.
I0719 06:49:27.115007 140042699994880 model_lib_v2.py:943] A replica probably exhausted all examples. Skipping pending e
xamples on other replicas.
Traceback (most recent call last):
File "/home/pictcompute/effient_net_ve/tensorflow/models/research/object_detection/model_main_tf2.py", line 115, in <m
odule>
tf.compat.v1.app.run()
File "/home/pictcompute/effient_net_ve/lib/python3.8/site-packages/tensorflow/python/platform/app.py", line 36, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "/home/pictcompute/effient_net_ve/lib/python3.8/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/home/pictcompute/effient_net_ve/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "/home/pictcompute/effient_net_ve/tensorflow/models/research/object_detection/model_main_tf2.py", line 82, in mai
n
model_lib_v2.eval_continuously(
File "/home/pictcompute/effient_net_ve/lib/python3.8/site-packages/object_detection/model_lib_v2.py", line 1159, in ev
al_continuously
eager_eval_loop(
File "/home/pictcompute/effient_net_ve/lib/python3.8/site-packages/object_detection/model_lib_v2.py", line 1009, in ea
ger_eval_loop
for evaluator in evaluators:
TypeError: 'NoneType' object is not iterable
Highly appreciate your help.
Thanks
The error occurs when you try to iterate over a None value. For example
mylist = None
for x in mylist:
print(x)
TypeError Traceback (most recent call last)
<ipython-input-2-a63d8b17c4a7> in <module>
1 mylist = None
2
----> 3 for x in mylist:
4 print(x)
TypeError: 'NoneType' object is not iterable
The error can be avoided by checking if a value is None or not before iterating over it. Thank You.
I have some difficulty with tensorflow_datasets when I was trying to load mnist.
python:3.7
tensorflow : 2.1.0
tensorflow_datasets has been upgraded to latest version 4.6, because the default version of tensorflow_datasets from tensorflow installation has no attribute 'load'
But now the problem is data can not be downloaded and extracted successfully.
with the following command:
datasets = tfds.load(name="mnist")
the error message is :
Downloading and preparing dataset Unknown size (download: Unknown size, generated: Unknown size, total: Unknown size) to ~\tensorflow_datasets\mnist\3.0.1...
Extraction completed...: 0 file [00:00, ? file/s]██████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 138.37 url/s]
Dl Size...: 100%|██████████████████████████████████████████████████████████████████████████| 11594722/11594722 [00:00<00:00, 373172106.07 MiB/s]
Dl Completed...: 100%|█████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 122.03 url/s]
Traceback (most recent call last):
File "", line 1, in
File "C:\Users\Wilso\Anaconda3\envs\tfgpu\lib\site-packages\tensorflow_datasets\core\load.py", line 327, in load
dbuilder.download_and_prepare(**download_and_prepare_kwargs)
File "C:\Users\Wilso\Anaconda3\envs\tfgpu\lib\site-packages\tensorflow_datasets\core\dataset_builder.py", line 483, in download_and_prepare
download_config=download_config,
File "C:\Users\Wilso\Anaconda3\envs\tfgpu\lib\site-packages\tensorflow_datasets\core\dataset_builder.py", line 1222, in _download_and_prepare
disable_shuffling=self.info.disable_shuffling,
File "C:\Users\Wilso\Anaconda3\envs\tfgpu\lib\site-packages\tensorflow_datasets\core\split_builder.py", line 310, in submit_split_generation
return self._build_from_generator(**build_kwargs)
File "C:\Users\Wilso\Anaconda3\envs\tfgpu\lib\site-packages\tensorflow_datasets\core\split_builder.py", line 376, in _build_from_generator
leave=False,
File "C:\Users\Wilso\Anaconda3\envs\tfgpu\lib\site-packages\tqdm\std.py", line 1195, in iter
for obj in iterable:
File "C:\Users\Wilso\Anaconda3\envs\tfgpu\lib\site-packages\tensorflow_datasets\image_classification\mnist.py", line 151, in _generate_examples
images = _extract_mnist_images(data_path, num_examples)
File "C:\Users\Wilso\Anaconda3\envs\tfgpu\lib\site-packages\tensorflow_datasets\image_classification\mnist.py", line 350, in _extract_mnist_images
f.read(16) # header
File "C:\Users\Wilso\Anaconda3\envs\tfgpu\lib\site-packages\tensorflow_core\python\lib\io\file_io.py", line 122, in read
self._preread_check()
File "C:\Users\Wilso\Anaconda3\envs\tfgpu\lib\site-packages\tensorflow_core\python\lib\io\file_io.py", line 84, in _preread_check
compat.as_bytes(self.__name), 1024 * 512)
File "C:\Users\Wilso\Anaconda3\envs\tfgpu\lib\site-packages\tensorflow_core\python\util\compat.py", line 87, in as_bytes
(bytes_or_text,))
TypeError: Expected binary or unicode string, got WindowsGPath('C:\Users\Wilso\tensorflow_datasets\downloads\extracted\GZIP.cvdf-datasets_mnist_train-images-idx3-ubyteRA_Kv3PMVG-iFHXoHqNwJlYF9WviEKQCTSyo8gNSNgk.gz')
Try:
(ds_train, ds_test), ds_info = tfds.load(
"mnist",
split=["train", "test"],
shuffle_files=True,
as_supervised=True, # will return tuple (img, label) otherwise dict
with_info=True, # able to get info about dataset
)
I am using the keras API in tensorflow 2.4.0 to save at different steps my model using this line of code:
tf.keras.callbacks.ModelCheckpoint(os.path.join(log_dir, "checkpoint_{epoch:02d}.tf"),
save_freq=train_steps_per_epoch * cfg.log.checkpoint_save_every_epochs,
save_weights_only=False))
When the model is saved I encounter some warnings:
[2021-03-17 12:11:09,974][absl][WARNING] - Found untraced functions such as nl_0_layer_call_fn, nl_0_layer_call_and_return_conditional_losses, nl_1_layer_call_fn, nl_1_layer_call_and_return_conditional_losses, conv2d_layer_call_fn while saving (showing 5 of 280). These functions will not be directly callable after loading.
And when I load the model using this line of code:
model = tf.keras.models.load_model(os.path.join(train_dir, f'checkpoint_{cfg.training.checkpoint_epoch:02d}.tf'))
I have this error:
Traceback (most recent call last):
File "run/linear_classifier_evaluation.py", line 51, in run
model = tf.keras.models.load_model(os.path.join(train_dir, f'checkpoint_{cfg.training.checkpoint_epoch:02d}.tf'))
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/saving/save.py", line 212, in load_model
return saved_model_load.load(filepath, compile, options)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/saving/saved_model/load.py", line 138, in load
keras_loader.load_layers(compile=compile)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/saving/saved_model/load.py", line 376, in load_layers
node_metadata.metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/saving/saved_model/load.py", line 417, in _load_layer
obj, setter = self._revive_from_config(identifier, metadata, node_id)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/saving/saved_model/load.py", line 434, in _revive_from_config
self._revive_graph_network(metadata, node_id) or
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/saving/saved_model/load.py", line 471, in _revive_graph_network
inputs=[], outputs=[], name=config['name'])
KeyError: 'name'
I've searched online this error and I've read people suggesting to implement get_config and from_config methods in my model class but I am not using the H5 format so I shouldn't have to do that if I understood correctly the keras tutorial and others solutions where for tf1.x.
I would gladly welcome any help or suggestions on where to look.
I am trying to save the complete model using model.save (instead of only checkpoints) at the end of training steps while using official retinanet object detection API. However, I am getting the below error when model.save is called:
I0414 17:18:52.661234 140283524683584 distributed_executor.py:49] Saving model as TF checkpoint: /home/ubuntu/ankur/models/official/vision/detection/ModelDIr/ctl_step_5.ckpt-1
WARNING:tensorflow:From /home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/ops/resource_variable_ops.py:1786: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
W0414 17:19:26.865248 140283524683584 deprecation.py:506] From /home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/ops/resource_variable_ops.py:1786: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
INFO:tensorflow:Assets written to: /home/ubuntu/ankur/models/official/vision/detection/saved_model/test_model/assets
I0414 17:19:33.264274 140283524683584 builder_impl.py:775] Assets written to: /home/ubuntu/ankur/models/official/vision/detection/saved_model/test_model/assets
Traceback (most recent call last):
File "main.py", line 235, in <module>
app.run(main)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/absl/app.py", line 299, in run
_run_main(main, args)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/absl/app.py", line 250, in _run_main
sys.exit(main(argv))
File "main.py", line 230, in main
run()
File "main.py", line 224, in run
callbacks=callbacks)
File "main.py", line 117, in run_executor
save_config=True)
File "/home/ubuntu/ankur/models/official/modeling/training/distributed_executor.py", line 480, in train
model.save('/home/ubuntu/ankur/models/official/vision/detection/saved_model/test_model')
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/network.py", line 1008, in save
signatures, options)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/keras/saving/save.py", line 115, in save_model
signatures, options)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/keras/saving/saved_model/save.py", line 78, in save
save_lib.save(model, filepath, signatures, options)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/saved_model/save.py", line 923, in save
saveable_view, asset_info.asset_index)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/saved_model/save.py", line 647, in _serialize_object_graph
concrete_function, saveable_view.captured_tensor_node_ids, coder)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/saved_model/function_serialization.py", line 70, in serialize_concrete_function
coder.encode_structure(structured_outputs))
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/saved_model/nested_structure_coder.py", line 95, in encode_structure
return self._map_structure(nested_structure, self._get_encoders())
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/saved_model/nested_structure_coder.py", line 79, in _map_structure
return do(pyobj, recursion_fn)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/saved_model/nested_structure_coder.py", line 203, in do_encode
encoded_dict.dict_value.fields[key].CopyFrom(encode_fn(value))
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/saved_model/nested_structure_coder.py", line 79, in _map_structure
return do(pyobj, recursion_fn)
File "/home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/tensorflow_core/python/saved_model/nested_structure_coder.py", line 203, in do_encode
encoded_dict.dict_value.fields[key].CopyFrom(encode_fn(value))
TypeError: 3 has type int, but expected one of: bytes, unicode
Only change that I have made in order to save the complete model is to add the command model.save('/home/ubuntu/ankur/models/official/vision/detection/saved_model/test_model') at the end of training in distributed_executor.py found at the path ~/models/official/modeling/training. Kindly suggest what can be the issue.
TensorFlow version- 2.1
GIThub link
I've tried all steps of object_detection model installation mentioned #
https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/installation.md
While testing the installation as mentioned in last step of article, I am getting below error.
ERROR: test_create_ssd_mobilenet_v1_model_from_config (__main__.ModelBuilderTest)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/RonakBhavsar/eDAM/ML/ObjectRecognition/models/object_detection/builders/model_builder_test.py", line 193, in test_create_ssd_mobilenet_v1_model_from_config
model = self.create_model(model_proto)
File "/Users/RonakBhavsar/eDAM/ML/ObjectRecognition/models/object_detection/builders/model_builder_test.py", line 53, in create_model
return model_builder.build(model_config, is_training=False)
File "/Users/RonakBhavsar/eDAM/ML/ObjectRecognition/models/object_detection/builders/model_builder.py", line 73, in build
return _build_ssd_model(model_config.ssd, is_training)
File "/Users/RonakBhavsar/eDAM/ML/ObjectRecognition/models/object_detection/builders/model_builder.py", line 126, in _build_ssd_model
is_training)
File "/Users/RonakBhavsar/eDAM/ML/ObjectRecognition/models/object_detection/builders/model_builder.py", line 98, in _build_ssd_feature_extractor
feature_extractor_config.conv_hyperparams, is_training)
File "/Users/RonakBhavsar/eDAM/ML/ObjectRecognition/models/object_detection/builders/hyperparams_builder.py", line 70, in build
hyperparams_config.regularizer),
File "/Users/RonakBhavsar/eDAM/ML/ObjectRecognition/models/object_detection/builders/hyperparams_builder.py", line 119, in _build_regularizer
return slim.l2_regularizer(scale=regularizer.l2_regularizer.weight)
File "/Users/RonakBhavsar/anaconda2/lib/python2.7/site-packages/tensorflow/contrib/layers/python/layers/regularizers.py", line 92, in l2_regularizer
raise ValueError('scale cannot be an integer: %s' % (scale,))
ValueError: scale cannot be an integer: 1
I get this error for all the models mentioned in the test script. Anyone has any ideas?
We have a pull request out that should fix this issue. Please give that a try.