When I try to generate TF record, I'm getting the following error message:
Traceback (most recent call last):
File "generate_tfrecord.py", line 112, in <module>
tf.app.run()
File "/home/harisohmnaathss/anaconda3/envs/tensorflow/lib/python3.5/site-
packages/tensorflow/python/platform/app.
py", line 124, in run
_sys.exit(main(argv))
File "generate_tfrecord.py", line 98, in main
writer = tf.python_io.TFRecordWriter(FLAGS.output_path)
File "/home/harisohmnaathss/anaconda3/envs/tensorflow/lib/python3.5/site-
packages/tensorflow/python/lib/io/tf_rec
ord.py", line 106, in __init__
compat.as_bytes(path), compat.as_bytes(compression_type), status)
File "/home/harisohmnaathss/anaconda3/envs/tensorflow/lib/python3.5/site-
packages/tensorflow/python/framework/err
ors_impl.py", line 473, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: ; No such file or
directory
The command that I try to run is:
python generate_tfrecord.py --csv_input=data/Train_labels.csv
--output_path=data/train.records
Any ideas to solve this issue?
You are supplying output path as data/train.records instead of data/train.tfrecord
Related
I'm trying to run on cloud this deep learning model:
https://github.com/razvanmarinescu/brgm#image-reconstruction-with-pre-trained-stylegan2-generators
What I do is simply utilizing their Colab Notebook: https://colab.research.google.com/drive/1G7_CGPHZVGFWIkHOAke4HFg06-tNHIZ4?usp=sharing#scrollTo=qMgE6QFiHuSL
When I try to exectute:
!python recon.py recon-real-images --input=/content/drive/MyDrive/boeing/EDGEconnect/val_imgs --masks=/content/drive/MyDrive/boeing/EDGEconnect/val_masks --tag=brains --network=dropbox:brains.pkl --recontype=inpaint --num-steps=1000 --num-snapshots=1
I receive this error:
args: Namespace(command='recon-real-images', input='/content/drive/MyDrive/boeing/EDGEconnect/val_imgs', masks='/content/drive/MyDrive/boeing/EDGEconnect/val_masks', network_pkl='dropbox:brains.pkl', num_snapshots=1, num_steps=1000, recontype='inpaint', superres_factor=4, tag='brains')
Local submit - run_dir: results/00004-brains-inpaint
dnnlib: Running recon.recon_real_images() on localhost...
Processing image 1/4
Loading networks from "dropbox:brains.pkl"...
Setting up TensorFlow plugin "fused_bias_act.cu": Preprocessing... Loading... Failed!
Traceback (most recent call last):
File "recon.py", line 270, in <module>
main()
File "recon.py", line 263, in main
dnnlib.submit_run(sc, func_name_map[subcmd], **kwargs)
File "/content/drive/MyDrive/boeing/brgm/brgm/dnnlib/submission/submit.py", line 343, in submit_run
return farm.submit(submit_config, host_run_dir)
File "/content/drive/MyDrive/boeing/brgm/brgm/dnnlib/submission/internal/local.py", line 22, in submit
return run_wrapper(submit_config)
File "/content/drive/MyDrive/boeing/brgm/brgm/dnnlib/submission/submit.py", line 280, in run_wrapper
run_func_obj(**submit_config.run_func_kwargs)
File "/content/drive/MyDrive/boeing/brgm/brgm/recon.py", line 189, in recon_real_images
recon_real_one_img(network_pkl, img_list[image_idx], masks, num_snapshots, recontype, superres_factor, num_steps)
File "/content/drive/MyDrive/boeing/brgm/brgm/recon.py", line 132, in recon_real_one_img
_G, _D, Gs = pretrained_networks.load_networks(network_pkl)
File "/content/drive/MyDrive/boeing/brgm/brgm/pretrained_networks.py", line 83, in load_networks
G, D, Gs = pickle.load(stream, encoding='latin1')
File "/content/drive/MyDrive/boeing/brgm/brgm/dnnlib/tflib/network.py", line 297, in __setstate__
self._init_graph()
File "/content/drive/MyDrive/boeing/brgm/brgm/dnnlib/tflib/network.py", line 154, in _init_graph
out_expr = self._build_func(*self.input_templates, **build_kwargs)
File "<string>", line 395, in G_synthesis_stylegan2
File "<string>", line 359, in layer
File "<string>", line 106, in modulated_conv2d_layer
File "<string>", line 75, in apply_bias_act
File "/content/drive/MyDrive/boeing/brgm/brgm/dnnlib/tflib/ops/fused_bias_act.py", line 68, in fused_bias_act
return impl_dict[impl](x=x, b=b, axis=axis, act=act, alpha=alpha, gain=gain)
File "/content/drive/MyDrive/boeing/brgm/brgm/dnnlib/tflib/ops/fused_bias_act.py", line 122, in _fused_bias_act_cuda
cuda_kernel = _get_plugin().fused_bias_act
File "/content/drive/MyDrive/boeing/brgm/brgm/dnnlib/tflib/ops/fused_bias_act.py", line 16, in _get_plugin
return custom_ops.get_plugin(os.path.splitext(__file__)[0] + '.cu')
File "/content/drive/MyDrive/boeing/brgm/brgm/dnnlib/tflib/custom_ops.py", line 156, in get_plugin
plugin = tf.load_op_library(bin_file)
File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/framework/load_library.py", line 61, in load_op_library
lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: /content/drive/MyDrive/boeing/brgm/brgm/dnnlib/tflib/_cudacache/fused_bias_act_237d55aca3e3c3ec0547da06888d8e66.so: undefined symbol: _ZN10tensorflow12OpDefBuilder4AttrENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
I found that the very last part of an error:
tensorflow.python.framework.errors_impl.NotFoundError: /content/drive/MyDrive/boeing/brgm/brgm/dnnlib/tflib/_cudacache/fused_bias_act_237d55aca3e3c3ec0547da06888d8e66.so: undefined symbol: _ZN10tensorflow12OpDefBuilder4AttrENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
Can be solved by changing a flag in Cuda Makefile: https://github.com/mgharbi/hdrnet_legacy/issues/2 or by installing tf 1.14(colab runs on 1.15.2 and this change made no positive effect).
My question is, how can I get rid of this error, is there an option to change smth inside Google Colab's Cuda Makefile?
The issue started appearing over the weekend. For some reason, it feels to be a DataFlow issue.
Previously, I was able to execute the script and write TF records just fine. However, now, I am unable to initialize the computation graph to process the data.
The traceback is:
Traceback (most recent call last):
File "my_script.py", line 1492, in <module>
MyBeamClass()
File "my_script.py", line 402, in __init__
self.run()
File "my_script.py", line 514, in run
transform_fn_io.WriteTransformFn(path=self.JOB_DIR + '/transform/'))
File "/anaconda3/envs/ml27/lib/python2.7/site-packages/apache_beam/pipeline.py", line 426, in __exit__
self.run().wait_until_finish()
File "/anaconda3/envs/ml27/lib/python2.7/site-packages/apache_beam/runners/dataflow/dataflow_runner.py", line 1238, in wait_until_finish
(self.state, getattr(self._runner, 'last_error_msg', None)), self)
apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException: Dataflow pipeline failed. State: FAILED, Error:
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 649, in do_work
work_executor.execute()
File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py", line 176, in execute
op.start()
File "apache_beam/runners/worker/operations.py", line 531, in apache_beam.runners.worker.operations.DoOperation.start
def start(self):
File "apache_beam/runners/worker/operations.py", line 532, in apache_beam.runners.worker.operations.DoOperation.start
with self.scoped_start_state:
File "apache_beam/runners/worker/operations.py", line 533, in apache_beam.runners.worker.operations.DoOperation.start
super(DoOperation, self).start()
File "apache_beam/runners/worker/operations.py", line 202, in apache_beam.runners.worker.operations.Operation.start
def start(self):
File "apache_beam/runners/worker/operations.py", line 206, in apache_beam.runners.worker.operations.Operation.start
self.setup()
File "apache_beam/runners/worker/operations.py", line 480, in apache_beam.runners.worker.operations.DoOperation.setup
with self.scoped_start_state:
File "apache_beam/runners/worker/operations.py", line 485, in apache_beam.runners.worker.operations.DoOperation.setup
pickler.loads(self.spec.serialized_fn))
File "/usr/local/lib/python2.7/dist-packages/apache_beam/internal/pickler.py", line 247, in loads
return dill.loads(s)
File "/usr/local/lib/python2.7/dist-packages/dill/_dill.py", line 317, in loads
return load(file, ignore)
File "/usr/local/lib/python2.7/dist-packages/dill/_dill.py", line 305, in load
obj = pik.load()
File "/usr/lib/python2.7/pickle.py", line 864, in load
dispatch[key](self)
File "/usr/lib/python2.7/pickle.py", line 1232, in load_build
for k, v in state.iteritems():
AttributeError: 'str' object has no attribute 'iteritems'
I am using tensorflow==1.13.1 and tensorflow-transform==0.9.0 and apache_beam==2.7.0
with beam.Pipeline(options=self.pipe_opt) as p:
with beam_impl.Context(temp_dir=self.google_cloud_options.temp_location):
# rest of the script
_ = (
transform_fn
| 'WriteTransformFn' >>
transform_fn_io.WriteTransformFn(path=self.JOB_DIR + '/transform/'))
I was experiencing the same error.
It seems to be triggered by a mismatch in the tensorflow-transform versions of your local (or master) machine and the workers one (specified in the setup.py file).
In my case I was running tensorflow-transform==0.13 on my local machine whereas the workers were running 0.8.
Downgrading the local version to 0.8 fixed the issue.
I got tensorflow and object detection API on my machine.
Test run shows that everything works.
~ $ cd models/research
research $ protoc object_detection/protos/*.proto --python_out=.
research $ export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
research $ python3 object_detection/builders/model_builder_test.py
...............
----------------------------------------------------------------------
Ran 15 tests in 0.144s
OK
Then I tried to retrain a model and got the protobuf error
research $ cd object_detection
object_detection $ python3 train.py --logtostderr --train_dir=training/ --pipeline_config_path=ssdlite_mobilenet_v2_coco_2018_05_09/pipeline.config
WARNING:tensorflow:From /Users/me/models/research/object_detection/trainer.py:257: create_global_step (from tensorflow.contrib.framework.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.create_global_step
Traceback (most recent call last):
File "/Users/me/models/research/object_detection/utils/label_map_util.py", line 135, in load_labelmap
text_format.Merge(label_map_string, label_map)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/google/protobuf/text_format.py", line 533, in Merge
descriptor_pool=descriptor_pool)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/google/protobuf/text_format.py", line 587, in MergeLines
return parser.MergeLines(lines, message)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/google/protobuf/text_format.py", line 620, in MergeLines
self._ParseOrMerge(lines, message)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/google/protobuf/text_format.py", line 635, in _ParseOrMerge
self._MergeField(tokenizer, message)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/google/protobuf/text_format.py", line 735, in _MergeField
merger(tokenizer, message, field)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/google/protobuf/text_format.py", line 823, in _MergeMessageField
self._MergeField(tokenizer, sub_message)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/google/protobuf/text_format.py", line 722, in _MergeField
tokenizer.Consume(':')
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/google/protobuf/text_format.py", line 1087, in Consume
raise self.ParseError('Expected "%s".' % token)
google.protobuf.text_format.ParseError: 3:10 : Expected ":".
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/google/protobuf/internal/python_message.py", line 1083, in MergeFromString
if self._InternalParse(serialized, 0, length) != length:
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/google/protobuf/internal/python_message.py", line 1105, in InternalParse
(tag_bytes, new_pos) = local_ReadTag(buffer, pos)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/google/protobuf/internal/decoder.py", line 181, in ReadTag
while six.indexbytes(buffer, pos) & 0x80:
TypeError: unsupported operand type(s) for &: 'str' and 'int'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "train.py", line 184, in <module>
tf.app.run()
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 126, in run
_sys.exit(main(argv))
File "train.py", line 180, in main
graph_hook_fn=graph_rewriter_fn)
File "/Users/me/models/research/object_detection/trainer.py", line 264, in train
train_config.prefetch_queue_capacity, data_augmentation_options)
File "/Users/me/models/research/object_detection/trainer.py", line 59, in create_input_queue
tensor_dict = create_tensor_dict_fn()
File "train.py", line 121, in get_next
dataset_builder.build(config)).get_next()
File "/Users/me/models/research/object_detection/builders/dataset_builder.py", line 155, in build
label_map_proto_file=label_map_proto_file)
File "/Users/me/models/research/object_detection/data_decoders/tf_example_decoder.py", line 245, in __init__
use_display_name)
File "/Users/me/models/research/object_detection/utils/label_map_util.py", line 152, in get_label_map_dict
label_map = load_labelmap(label_map_path)
File "/Users/me/models/research/object_detection/utils/label_map_util.py", line 137, in load_labelmap
label_map.ParseFromString(label_map_string)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/google/protobuf/message.py", line 185, in ParseFromString
self.MergeFromString(serialized)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/google/protobuf/internal/python_message.py", line 1089, in MergeFromString
raise message_mod.DecodeError('Truncated message.')
google.protobuf.message.DecodeError: Truncated message.
object_detection $
I tried a bunch of similar problems' solutions, but they didn't work for my case. For example this one suggest to encode pbtxt file with ASCII.
Python 2 gives an error too. Here is it's last line
google.protobuf.message.DecodeError: Unexpected end-group tag.
Context:
macOS 10.13.4
Local run on CPU
Python 3.6.4
protobuf 3.5.1
libprotoc 3.4.0
tensorflow 1.8.0
Google Cloud SDK 200.0.0
bq 2.0.33
core 2018.04.30
gsutil 4.31
I am trying to train my own model for tensorflow object detection, I followed this and this tutorials and in last step I tried to run this command
> python train.py --logtostderr --train_dir=training/ --
pipeline_config_path=training/ssd_mobilenet_v1_pets.config
but I get this error
> Traceback (most recent call last):
File "train.py", line 163, in <module>
tf.app.run()
File "C:\Program Files\Python36\lib\site-packages\tensorflow\python\platform\a
pp.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "train.py", line 91, in main
FLAGS.pipeline_config_path)
File "C:\Libraries\models-master\research\object_detection\utils\config_util.p
y", line 43, in get_configs_from_pipeline_file
text_format.Merge(proto_str, pipeline_config)
File "C:\Program Files\Python36\lib\site-packages\google\protobuf\text_format.
py", line 533, in Merge
descriptor_pool=descriptor_pool)
File "C:\Program Files\Python36\lib\site-packages\google\protobuf\text_format.
py", line 587, in MergeLines
return parser.MergeLines(lines, message)
File "C:\Program Files\Python36\lib\site-packages\google\protobuf\text_format.
py", line 620, in MergeLines
self._ParseOrMerge(lines, message)
File "C:\Program Files\Python36\lib\site-packages\google\protobuf\text_format.
py", line 635, in _ParseOrMerge
self._MergeField(tokenizer, message)
File "C:\Program Files\Python36\lib\site-packages\google\protobuf\text_format.
py", line 703, in _MergeField
(message_descriptor.full_name, name))
google.protobuf.text_format.ParseError: 195:3 : Message type "object_detection.p
rotos.TrainEvalPipelineConfig" has no field named "shuffle".
how can I solve it ?
I fix that problem with changing .config file. I used ssd_mobilenet_v1_coco.config instead of ssd_mobilenet_v1_pets.config
I've cloned depot tools given instructions from here on ubuntu 17.04.
When I run fetch chromium I get the following error.
./fetch chromium
Running: gclient root
Traceback (most recent call last):
File "./fetch.py", line 299, in <module>
sys.exit(main())
File "./fetch.py", line 294, in main
return run(options, spec, root)
File "./fetch.py", line 280, in run
if not options.force and checkout.exists():
File "./fetch.py", line 82, in exists
gclient_root = self.run_gclient('root').strip()
File "./fetch.py", line 78, in run_gclient
return self.run(cmd_prefix + cmd, **kwargs)
File "./fetch.py", line 68, in run
return subprocess.check_output(cmd, **kwargs)
File "/usr/lib/python2.7/subprocess.py", line 212, in check_output
process = Popen(stdout=PIPE, *popenargs, **kwargs)
File "/usr/lib/python2.7/subprocess.py", line 390, in __init__
errread, errwrite)
File "/usr/lib/python2.7/subprocess.py", line 1024, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory
I have come across similar errors in stackoverflow while searching for a resolution however none of them helped resolve the problem.
I'm not clear on if there is any steps required after cloning to install depot tools? The readme
Any thoughts on what is required to solve the problem?
Conteh