I renamed the model dir from /home/abcd/andrew_model_jul25_tif/ which contained model and summary directories to /home/abcd/andrew_model_sep22/ which contained the same two folders. When I ran the python script it gave me the following error:
Traceback (most recent call last):
File "eval_on_full_image.py", line 127, in <module>
tf.app.run()
File "/home/abcd/virtualenvs/project/local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "eval_on_full_image.py", line 119, in main
do_eval_on_whole(model_dir, file, file[a:], output_dir)
File "eval_on_full_image.py", line 51, in do_eval_on_whole
saver.restore(sess,tf.train.latest_checkpoint(model_dir))
File "/home/abcd/virtualenvs/project/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1602, in latest_checkpoint
if file_io.get_matching_files(v2_path) or file_io.get_matching_files(
File "/home/abcd/virtualenvs/project/local/lib/python2.7/site-packages/tensorflow/python/lib/io/file_io.py", line 334, in get_matching_files
compat.as_bytes(single_filename), status)
File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
self.gen.next()
File "/home/abcd/virtualenvs/project/local/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.NotFoundError: /home/abcd/andrew_model_jul25_tif/model
When I changed the folder's name back to andrew_model_jul25 the script worked. Can changing the folder's name have such an effect?
I'm using the 1.1.0 version of tf, and running it on a GPU.
The problem arises here:
saver.restore(sess,tf.train.latest_checkpoint(model_dir))
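Try updating your model_dir variable so that it points to the renamed directory. Note that the error still refers to the old path, /home/abcd/andrew_model_jul25_tif/model, so that path is still being read from somewhere.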
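One possible cause, not confirmed in this thread: the plain-text `checkpoint` state file inside the model directory may have recorded absolute paths when the model was saved, in which case tf.train.latest_checkpoint keeps returning the old location after a rename. Below is a minimal sketch of rewriting those paths. The helper name rewrite_checkpoint_paths is hypothetical, and the assumption that the file stores the old prefix verbatim should be checked by opening the file first (and backing it up).

```python
import os


def rewrite_checkpoint_paths(model_dir, old_prefix, new_prefix):
    """Replace old_prefix with new_prefix inside model_dir/checkpoint.

    Assumes the 'checkpoint' state file is plain text and contains the
    old absolute directory verbatim (an assumption, not a TensorFlow API).
    """
    state_path = os.path.join(model_dir, "checkpoint")
    with open(state_path) as f:
        text = f.read()
    updated = text.replace(old_prefix, new_prefix)
    with open(state_path, "w") as f:
        f.write(updated)
    return updated


# Hypothetical usage for the directories named in the question:
# rewrite_checkpoint_paths("/home/abcd/andrew_model_sep22/model",
#                          "/home/abcd/andrew_model_jul25_tif",
#                          "/home/abcd/andrew_model_sep22")
```

If the file turns out to hold only relative file names, this helper changes nothing and the cause lies elsewhere.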
Can someone help me with this error? I tried uninstalling TensorFlow and TensorBoard and reinstalling them, but I am still facing the issue.
ERROR: Failed to launch TensorBoard (exited with 1).
Contents of stderr:
2021-05-29 16:11:25.794509: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudart64_110.dll
Traceback (most recent call last):
File "c:\users\shara\anaconda3\lib\runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "c:\users\shara\anaconda3\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\Users\shara\anaconda3\Scripts\tensorboard.exe\__main__.py", line 7, in <module>
File "c:\users\shara\anaconda3\lib\site-packages\tensorboard\main.py", line 46, in run_main
app.run(tensorboard.main, flags_parser=tensorboard.configure)
File "c:\users\shara\anaconda3\lib\site-packages\absl\app.py", line 300, in run
_run_main(main, args)
File "c:\users\shara\anaconda3\lib\site-packages\absl\app.py", line 251, in _run_main
sys.exit(main(argv))
File "c:\users\shara\anaconda3\lib\site-packages\tensorboard\program.py", line 276, in main
return runner(self.flags) or 0
File "c:\users\shara\anaconda3\lib\site-packages\tensorboard\program.py", line 292, in _run_serve_subcommand
server = self._make_server()
File "c:\users\shara\anaconda3\lib\site-packages\tensorboard\program.py", line 472, in _make_server
deprecated_multiplexer,
File "c:\users\shara\anaconda3\lib\site-packages\tensorboard\backend\application.py", line 145, in TensorBoardWSGIApp
experimental_middlewares,
File "c:\users\shara\anaconda3\lib\site-packages\tensorboard\backend\application.py", line 253, in __init__
"Duplicate plugins for name %s" % plugin.plugin_name
ValueError: Duplicate plugins for name projector
You may still have two versions of TensorBoard installed.
Here is a script that checks for problems with your TensorBoard installation and may give you instructions on how to fix them:
https://raw.githubusercontent.com/tensorflow/tensorboard/master/tensorboard/tools/diagnose_tensorboard.py
I would give it a try.
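As a quick manual check alongside that script, the sketch below lists which TensorBoard-providing distributions are installed in the current environment. The candidate names ("tensorboard" and "tb-nightly") are an assumption about which packages might conflict, and importlib.metadata requires Python 3.8 or newer.

```python
from importlib import metadata

# Assumed list of distributions that each ship a 'tensorboard' package.
CANDIDATES = ("tensorboard", "tb-nightly")


def installed_candidates(candidates=CANDIDATES):
    """Return {distribution name: version} for each candidate that is installed."""
    found = {}
    for name in candidates:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Not installed in this environment; skip it.
            pass
    return found


if __name__ == "__main__":
    print(installed_candidates())
```

If more than one entry is reported, uninstalling all of them and reinstalling a single one would be a reasonable next step.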
I'm setting up an Odoo Community server and, after adding and removing an addon, I am getting an error.
I have tried copying the Odoo addons files back to the default install ones, with no success.
Exception in thread odoo.service.cron.cron0:
Traceback (most recent call last):
File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
self.run()
File "/usr/lib/python3.6/threading.py", line 864, in run
self._target(*self._args, **self._kwargs)
File "/opt/odoo/odoo/odoo/service/server.py", line 244, in target
self.cron_thread(i)
File "/opt/odoo/odoo/odoo/service/server.py", line 218, in cron_thread
from odoo.addons.base.ir.ir_cron import ir_cron
File "/opt/odoo/odoo/odoo/modules/module.py", line 82, in load_module
exec(open(modfile, 'rb').read(), new_mod.__dict__)
File "<string>", line 4, in <module>
File "/opt/odoo/odoo/odoo/addons/base/ir/__init__.py", line 8, in <module>
from . import ir_actions
File "/opt/odoo/odoo/odoo/addons/base/ir/ir_actions.py", line 8, in <module>
from odoo.tools import pycompat, wrap_module
ImportError: cannot import name 'wrap_module'
That is the error I am getting. I've tried using pip3 to install pycompat, but it still doesn't work.
Make sure you are running Odoo on Python 3. In a terminal:
python3 path_to_odoo/./odoo-bin
Then clear the browser cache and try again.
Following the Running MNIST on Cloud TPU tutorial:
I get the following error when I try to train:
python /usr/share/models/official/mnist/mnist_tpu.py \
--tpu=$TPU_NAME \
--DATA_DIR=${STORAGE_BUCKET}/data \
--MODEL_DIR=${STORAGE_BUCKET}/output \
--use_tpu=True \
--iterations=500 \
--train_steps=2000
=>
alexryan@alex-tpu:~/tpu$ ./train-mnist.sh
W1025 20:21:39.351166 139745816463104 __init__.py:44] file_cache is unavailable when using oauth2client >= 4.0.0 or google-auth
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/googleapiclient/discovery_cache/__init__.py", line 41, in autodetect
from . import file_cache
File "/usr/local/lib/python2.7/dist-packages/googleapiclient/discovery_cache/file_cache.py", line 41, in <module>
'file_cache is unavailable when using oauth2client >= 4.0.0 or google-auth')
ImportError: file_cache is unavailable when using oauth2client >= 4.0.0 or google-auth
Traceback (most recent call last):
File "/usr/share/models/official/mnist/mnist_tpu.py", line 173, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "/usr/share/models/official/mnist/mnist_tpu.py", line 152, in main
tpu_config=tf.contrib.tpu.TPUConfig(FLAGS.iterations, FLAGS.num_shards),
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/tpu/python/tpu/tpu_config.py", line 207, in __init__
self._master = cluster.master()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/cluster_resolver/python/training/tpu_cluster_resolver.py", line 223, in master
job_tasks = self.cluster_spec().job_tasks(self._job_name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/cluster_resolver/python/training/tpu_cluster_resolver.py", line 269, in cluster_spec
(compat.as_text(self._tpu), response['health']))
RuntimeError: TPU "alex-tpu" is unhealthy: "TIMEOUT"
alexryan@alex-tpu:~/tpu$
The only places where I varied from the instructions were:
Instead of running ctpu in the cloud shell, I ran it on the mac.
>ctpu version
ctpu version: 1.7
The zone where the TPU resided was different than the default zone of my config, so I specified it as an option like so:
>cat ctpu-up.sh
ctpu up --zone us-central1-b --preemptible
I was able to move the MNIST files to the gcs bucket from the vm no problem:
alexryan@alex-tpu:~$ gsutil cp -r ./data ${STORAGE_BUCKET}
Copying file://./data/validation.tfrecords [Content-Type=application/octet-stream]...
Copying file://./data/train-images-idx3-ubyte.gz [Content-Type=application/octet-stream]...
I tried the (Optional) Set up TensorBoard >
Running cloud_tpu_profiler
Go to the Cloud Console > TPUs > and click on the TPU you created.
Locate the service account name for the Cloud TPU and copy it, for
example:
service-11111111118@cloud-tpu.iam.myserviceaccount.com
In the list of buckets, select the bucket you want to use, select Show
Info Panel, and then select Edit bucket permissions. Paste your
service account name into the add members field for that bucket and
select the following permissions:
"Cloud Console > TPUs" does not exist as an option
so I used the service account associated with the VM:
"Cloud Console > Compute Engine > alex-tpu"
Since the last error message was RuntimeError: TPU "alex-tpu" is unhealthy: "TIMEOUT", I used ctpu to delete the VM and re-create it, then ran the training again.
This time I got more errors:
This one seems like it might be just a warning ...
ImportError: file_cache is unavailable when using oauth2client >=
4.0.0 or google-auth
Not sure about this one ...
ERROR:tensorflow:Operation of type Placeholder (reshape_input) is not supported on the TPU. Execution will fail if this op is used in the graph.
this one seemed to kill the training ...
INFO:tensorflow:Error recorded from training_loop: File system scheme '[local]' not implemented (file: '/tmp/tmpaiggRW/model.ckpt-0_temp_9216e11a1368405795d9b5282775f562') [[{{node save/SaveV2}} = SaveV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT64],
_device="/job:worker/replica:0/task:0/device:CPU:0"](save/ShardedFilename, save/SaveV2/tensor_names, save/SaveV2/shape_and_slices, conv2d/bias/Read/ReadVariableOp, conv2d/kernel/Read/ReadVariableOp, conv2d_1/bias/Read/ReadVariableOp, conv2d_1/kernel/Read/ReadVariableOp, dense/bias/Read/ReadVariableOp, dense/kernel/Read/ReadVariableOp, dense_1/bias/Read/ReadVariableOp, dense_1/kernel/Read/ReadVariableOp, global_step/Read/ReadVariableOp)]]
Caused by op u'save/SaveV2', defined at:
File "/usr/share/models/official/mnist/mnist_tpu.py", line 173, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "/usr/share/models/official/mnist/mnist_tpu.py", line 163, in main
estimator.train(input_fn=train_input_fn, max_steps=FLAGS.train_steps)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2394, in train
saving_listeners=saving_listeners
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 356, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 1181, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 1215, in _train_model_default
saving_listeners)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 1406, in _train_with_estimator_spec
log_step_count_steps=self._config.log_step_count_steps) as mon_sess:
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 504, in MonitoredTrainingSession
stop_grace_period_secs=stop_grace_period_secs)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 921, in __init__
stop_grace_period_secs=stop_grace_period_secs)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 643, in __init__
self._sess = _RecoverableSession(self._coordinated_creator)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 1107, in __init__
_WrappedSession.__init__(self, self._create_session())
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 1112, in _create_session
return self._sess_creator.create_session()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 800, in create_session
self.tf_sess = self._session_creator.create_session()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 557, in create_session
self._scaffold.finalize()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 215, in finalize
self._saver.build()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1106, in build
self._build(self._filename, build_save=True, build_restore=True)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1143, in _build
build_save=build_save, build_restore=build_restore)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 778, in _build_internal
save_tensor = self._AddShardedSaveOps(filename_tensor, per_device)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 369, in _AddShardedSaveOps
return self._AddShardedSaveOpsForV2(filename_tensor, per_device)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 343, in _AddShardedSaveOpsForV2
sharded_saves.append(self._AddSaveOps(sharded_filename, saveables))
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 284, in _AddSaveOps
save = self.save_op(filename_tensor, saveables)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 202, in save_op
tensors)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_io_ops.py", line 1690, in save_v2
shape_and_slices=shape_and_slices, tensors=tensors, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 3272, in create_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1768, in __init__
self._traceback = tf_stack.extract_stack()
UnimplementedError (see above for traceback): File system scheme '[local]' not implemented (file: '/tmp/tmpaiggRW/model.ckpt-0_temp_9216e11a1368405795d9b5282775f562')
UPDATE
I get this error ...
INFO:tensorflow:Error recorded from training_loop: File system scheme '[local]' not implemented
... even when --use_tpu=False
alexryan@alex-tpu:~/tpu$ cat train-mnist.sh
python /usr/share/models/official/mnist/mnist_tpu.py \
--tpu=$TPU_NAME \
--DATA_DIR=${STORAGE_BUCKET}/data \
--MODEL_DIR=${STORAGE_BUCKET}/output \
--use_tpu=False \
--iterations=500 \
--train_steps=2000
This Stack Overflow answer suggests that the TPU is trying to write to a non-existent file system instead of the GCS bucket I specified. It is unclear to me why that would be happening.
In the first scenario, it seems the TPU you created was not in a healthy state, so deleting and recreating the TPU or the entire VM is the right way to resolve this.
I think the error in the second scenario (where you deleted the VM and re-created it) occurs because your ${STORAGE_BUCKET} is either undefined or not a proper GCS bucket. It should be a GCS bucket; a local path won't work and gives the "File system scheme '[local]' not implemented" error.
More information on creating a GCS bucket is in the section "Create a Cloud Storage bucket" at https://cloud.google.com/tpu/docs/tutorials/mnist
Hope this answers your question.
I ran into the same problem and found that there was a typo in the tutorial. If you check mnist_tpu.py you'll find that the flag names (--data_dir and --model_dir) need to be lowercase.
If you change that, it works fine:
python /usr/share/models/official/mnist/mnist_tpu.py \
--tpu=$TPU_NAME \
--data_dir=${STORAGE_BUCKET}/data \
--model_dir=${STORAGE_BUCKET}/output \
--use_tpu=True \
--iterations=500 \
--train_steps=2000
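One plausible reading, not confirmed in this thread, is that the upper-case flags were simply not recognized, so model_dir kept its default and checkpoints went to a local temporary directory (the /tmp/... path in the error). The sketch below illustrates that failure mode with argparse as a stand-in; it is not TensorFlow's flag parser, and the bucket name is hypothetical.

```python
import argparse


def parse(argv):
    """Parse flags, tolerating unknown ones, as a lenient parser might."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--data_dir", default=None)
    parser.add_argument("--model_dir", default=None)
    # parse_known_args returns unrecognized arguments instead of failing.
    return parser.parse_known_args(argv)


known, unknown = parse([
    "--MODEL_DIR=gs://my-bucket/output",  # wrong case: not recognized
    "--data_dir=gs://my-bucket/data",     # correct case: recognized
])
print(known.model_dir)  # None
print(known.data_dir)   # gs://my-bucket/data
print(unknown)          # ['--MODEL_DIR=gs://my-bucket/output']
```

Flag names are matched case-sensitively here, so the mistyped flag is silently set aside and the default value is used.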
I am using a Google Cloud VM instance to develop a custom object detector with the TensorFlow Object Detection API. I am using the pretrained model faster_rcnn_inception_resnet_v2_atrous_coco.
After creating all the necessary TFRecord files for input and configuring the object_detection pipeline config files, I used the following command for training:
python train.py --logtostderr --train_dir=training /
--pipeline_config_path=training/faster_rcnn_custom.config
I get the following error:
Traceback (most recent call last):
File "train.py", line 184, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "train.py", line 180, in main
graph_hook_fn=graph_rewriter_fn)
File "/opt/models/research/object_detection/trainer.py", line 274, in train
train_config.prefetch_queue_capacity, data_augmentation_options)
File "/opt/models/research/object_detection/trainer.py", line 59, in create_input_queue
tensor_dict = create_tensor_dict_fn()
File "train.py", line 121, in get_next
dataset_builder.build(config)).get_next()
File "/opt/models/research/object_detection/builders/dataset_builder.py", line 176, in build
num_additional_channels=num_additional_channels)
File "/opt/models/research/object_detection/data_decoders/tf_example_decoder.py", line 204, in __init__
repeated=True)
TypeError: __init__() got an unexpected keyword argument 'repeated'
How should I fix the error? I am quite new to this. Any help would be appreciated.
Check that your command is correct and that the config file is in the correct relative directory. I see there is a space in "training /"; it should be "training/".
My assumption is that the error is due to an incompatibility between tf_example_decoder.py and the installed TensorFlow version. Try removing that argument. Hopefully this helps.
I had a similar issue: I had an older TensorFlow installed and was trying to use newer models. Upgrading TensorFlow solved my problem.
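To confirm a version mismatch like this before editing library code, one option is to inspect whether the installed callable accepts the keyword at all. The sketch below uses inspect.signature (Python 3). Both handler classes are hypothetical stand-ins for an older and a newer library version, not the real TensorFlow classes.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if func can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params and params[name].kind in (
            inspect.Parameter.POSITIONAL_OR_KEYWORD,
            inspect.Parameter.KEYWORD_ONLY):
        return True
    # A **kwargs parameter accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD
               for p in params.values())


class OldHandler(object):
    """Hypothetical stand-in for an older release without `repeated`."""
    def __init__(self, tensor_key, shape=None):
        self.tensor_key = tensor_key


class NewHandler(object):
    """Hypothetical stand-in for a newer release that added `repeated`."""
    def __init__(self, tensor_key, shape=None, repeated=False):
        self.tensor_key = tensor_key


print(accepts_keyword(OldHandler, "repeated"))  # False
print(accepts_keyword(NewHandler, "repeated"))  # True
```

Running the same check against the class that tf_example_decoder.py instantiates would show whether the installed TensorFlow predates the argument.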
I've built TensorFlow with custom SIMD extensions and created a wheel for it. If I simply do pip install /tmp/tensorflow_pkg/tensorflow-1.0.0-cp27-cp27mu-linux_x86_64.whl on the box that I built it on, that works. However, if I upload the whl file to cloud storage and do pip install https://storage.cloud.google.com/path/to/tensorflow-1.0.0-cp27-cp27mu-linux_x86_64.whl, I get this error:
Collecting tensorflow==1.0.0 from https://storage.cloud.google.com/path/to/tensorflow-1.0.0-cp27-cp27mu-linux_x86_64.whl
Downloading https://storage.cloud.google.com/path/to/tensorflow-1.0.0-cp27-cp27mu-linux_x86_64.whl
Exception:
Traceback (most recent call last):
File "/usr/local/lib/python2.7/site-packages/pip/basecommand.py", line 215, in main
status = self.run(options, args)
File "/usr/local/lib/python2.7/site-packages/pip/commands/install.py", line 335, in run
wb.build(autobuilding=True)
File "/usr/local/lib/python2.7/site-packages/pip/wheel.py", line 749, in build
self.requirement_set.prepare_files(self.finder)
File "/usr/local/lib/python2.7/site-packages/pip/req/req_set.py", line 380, in prepare_files
ignore_dependencies=self.ignore_dependencies))
File "/usr/local/lib/python2.7/site-packages/pip/req/req_set.py", line 620, in _prepare_file
session=self.session, hashes=hashes)
File "/usr/local/lib/python2.7/site-packages/pip/download.py", line 821, in unpack_url
hashes=hashes
File "/usr/local/lib/python2.7/site-packages/pip/download.py", line 663, in unpack_http_url
unpack_file(from_path, location, content_type, link)
File "/usr/local/lib/python2.7/site-packages/pip/utils/__init__.py", line 599, in unpack_file
flatten=not filename.endswith('.whl')
File "/usr/local/lib/python2.7/site-packages/pip/utils/__init__.py", line 484, in unzip_file
zip = zipfile.ZipFile(zipfp, allowZip64=True)
File "/usr/local/lib/python2.7/zipfile.py", line 770, in __init__
self._RealGetContents()
File "/usr/local/lib/python2.7/zipfile.py", line 811, in _RealGetContents
raise BadZipfile, "File is not a zip file"
BadZipfile: File is not a zip file
Do I need to configure my build differently somehow?
(capturing the solution as an answer)
The URL used for the download was not correct. The base URL needed to be storage.googleapis.com rather than storage.cloud.google.com.
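A likely explanation, not confirmed in this thread, is that the original URL returned something other than the wheel (for example an HTML sign-in page), which pip then failed to unzip. A wheel is a zip archive, so a downloaded file can be sanity-checked with the standard library before installing. The function name looks_like_wheel is hypothetical.

```python
import io
import zipfile


def looks_like_wheel(path_or_file):
    """Return True if the given path or file object is a valid zip archive."""
    return zipfile.is_zipfile(path_or_file)


# Build a tiny in-memory zip to stand in for a real wheel.
good = io.BytesIO()
with zipfile.ZipFile(good, "w") as zf:
    zf.writestr("pkg/__init__.py", "")
good.seek(0)

# An HTML page saved under a .whl name is not a zip archive.
bad = io.BytesIO(b"<!DOCTYPE html><html><body>Sign in</body></html>")

print(looks_like_wheel(good))  # True
print(looks_like_wheel(bad))   # False
```

Passing the path of the downloaded .whl file works the same way as passing a file object.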