Error occurs when I try to execute train.py - TensorFlow

When I execute
python train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/faster_rcnn_inception_v2_pets.config
I get this error:
WARNING:tensorflow:
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
* https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
* https://github.com/tensorflow/addons
* https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.
WARNING:tensorflow:From train.py:55: The name tf.logging.set_verbosity
is deprecated. Please use tf.compat.v1.logging.set_verbosity instead.
WARNING:tensorflow:From train.py:55: The name tf.logging.INFO is
deprecated. Please use tf.compat.v1.logging.INFO instead.
WARNING:tensorflow:From train.py:167: The name tf.app.run is
deprecated. Please use tf.compat.v1.app.run instead.
WARNING:tensorflow:From train.py:89: The name tf.gfile.MakeDirs is
deprecated. Please use tf.io.gfile.makedirs instead.
W1212 22:01:57.353342 3060 deprecation_wrapper.py:119] From
train.py:89: The name tf.gfile.MakeDirs is deprecated. Please use
tf.io.gfile.makedirs instead.
WARNING:tensorflow:From
c:\users\aamir\desktop\models\research\object_detection\utils\config_util.py:86:
The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile
instead.
W1212 22:01:57.354341 3060 deprecation_wrapper.py:119] From
c:\users\aamir\desktop\models\research\object_detection\utils\config_util.py:86:
The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile
instead.
WARNING:tensorflow:From train.py:94: The name tf.gfile.Copy is
deprecated. Please use tf.io.gfile.copy instead.
W1212 22:01:57.358338 3060 deprecation_wrapper.py:119] From
train.py:94: The name tf.gfile.Copy is deprecated. Please use
tf.io.gfile.copy instead.
WARNING:tensorflow:From c:\users\aamir\desktop\models\research\object_detection\anchor_generators\grid_anchor_generator.py:59: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
W1212 22:01:57.401396 3060 deprecation.py:323] From c:\users\aamir\desktop\models\research\object_detection\anchor_generators\grid_anchor_generator.py:59: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
INFO:tensorflow:Scale of 0 disables regularizer.
I1212 22:01:57.406377 3060 regularizers.py:98] Scale of 0 disables regularizer.
INFO:tensorflow:Scale of 0 disables regularizer.
I1212 22:01:57.406377 3060 regularizers.py:98] Scale of 0 disables regularizer.
WARNING:tensorflow:From c:\users\aamir\desktop\models\research\object_detection\trainer.py:228: create_global_step (from tensorflow.contrib.framework.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.create_global_step
W1212 22:01:57.408376 3060 deprecation.py:323] From c:\users\aamir\desktop\models\research\object_detection\trainer.py:228: create_global_step (from tensorflow.contrib.framework.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.create_global_step
WARNING:tensorflow:From c:\users\aamir\desktop\models\research\object_detection\data_decoders\tf_example_decoder.py:104: The name tf.FixedLenFeature is deprecated. Please use tf.io.FixedLenFeature instead.
W1212 22:01:57.413390 3060 deprecation_wrapper.py:119] From
c:\users\aamir\desktop\models\research\object_detection\data_decoders\tf_example_decoder.py:104:
The name tf.FixedLenFeature is deprecated. Please use
tf.io.FixedLenFeature instead.
WARNING:tensorflow:From
c:\users\aamir\desktop\models\research\object_detection\data_decoders\tf_example_decoder.py:119:
The name tf.VarLenFeature is deprecated. Please use
tf.io.VarLenFeature instead.
W1212 22:01:57.414372 3060 deprecation_wrapper.py:119] From
c:\users\aamir\desktop\models\research\object_detection\data_decoders\tf_example_decoder.py:119:
The name tf.VarLenFeature is deprecated. Please use
tf.io.VarLenFeature instead.
Traceback (most recent call last):
File "train.py", line 167, in <module>
tf.app.run()
File "C:\Users\Aamir\Anaconda3\lib\site-packages\tensorflow\python\platform\app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "C:\Users\Aamir\Anaconda3\lib\site-packages\absl\app.py", line 299, in run
_run_main(main, args)
File "C:\Users\Aamir\Anaconda3\lib\site-packages\absl\app.py", line 250, in _run_main
sys.exit(main(argv))
File "train.py", line 163, in main
worker_job_name, is_chief, FLAGS.train_dir)
File "c:\users\aamir\desktop\models\research\object_detection\trainer.py", line 235, in train
train_config.prefetch_queue_capacity, data_augmentation_options)
File "c:\users\aamir\desktop\models\research\object_detection\trainer.py", line 59, in create_input_queue
tensor_dict = create_tensor_dict_fn()
File "train.py", line 120, in get_next
dataset_builder.build(config)).get_next()
File "c:\users\aamir\desktop\models\research\object_detection\builders\dataset_builder.py", line 138, in build
label_map_proto_file=label_map_proto_file)
File "c:\users\aamir\desktop\models\research\object_detection\data_decoders\tf_example_decoder.py", line 195, in __init__
use_display_name)
File "c:\users\aamir\desktop\models\research\object_detection\utils\label_map_util.py", line 149, in get_label_map_dict
label_map = load_labelmap(label_map_path)
File "c:\users\aamir\desktop\models\research\object_detection\utils\label_map_util.py", line 129, in load_labelmap
label_map_string = fid.read()
File "C:\Users\Aamir\Anaconda3\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 122, in read
self._preread_check()
File "C:\Users\Aamir\Anaconda3\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 84, in _preread_check
compat.as_bytes(self.__name), 1024 * 512)
tensorflow.python.framework.errors_impl.NotFoundError: NewRandomAccessFile failed to Create/Open: C:/Users/Aamir/Desktop/models/research/object_detection/training/object-detection.pbtxt : The system cannot find the file specified. ; No such file or directory

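The traceback ends in load_labelmap, which tries to read C:/Users/Aamir/Desktop/models/research/object_detection/training/object-detection.pbtxt, but that file does not exist. Everything above it is only deprecation warnings. Most likely the label map path configured for this run is wrong: either the file is in a different folder or its name has a typo.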
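One way to confirm this before starting a training run is to pull the label map paths out of the pipeline config and check that each one exists. The sketch below is only an illustration: it assumes the config stores them as `label_map_path: "..."` entries (as the Object Detection API sample configs do), and the helper names `find_label_map_paths` and `missing_paths` are made up for this example.

```python
import os
import re


def find_label_map_paths(config_text):
    """Return every label_map_path value found in a pipeline config's text."""
    return re.findall(r'label_map_path\s*:\s*"([^"]*)"', config_text)


def missing_paths(paths):
    """Return the subset of paths that do not exist as files on disk."""
    return [p for p in paths if not os.path.isfile(p)]


if __name__ == "__main__":
    # Config path taken from the command in the question; adjust to your setup.
    config_path = "training/faster_rcnn_inception_v2_pets.config"
    if os.path.isfile(config_path):
        with open(config_path, encoding="utf-8") as f:
            paths = find_label_map_paths(f.read())
        for p in missing_paths(paths):
            print("label map not found:", p)
    else:
        print("config not found:", config_path)
```

Run it from the same directory you launch train.py from, since relative paths in the config are resolved against the working directory.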

Related

AssertionError: yolov3/convolutional59/ is not in graph

I am trying to convert yolov3 weights to tflite using DW2TF.
Here is the tutorial I am following.
When I execute the following command, I get an error.
!python to_frozen_graph.py --model_dir data --output_node_names yolov3/convolutional59/
Here is the error.
WARNING:tensorflow:From to_frozen_graph.py:18: The name tf.gfile.Exists is deprecated. Please use tf.io.gfile.exists instead.
WARNING:tensorflow:From to_frozen_graph.py:39: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.
2022-06-12 15:38:28.599693: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2022-06-12 15:38:28.608002: E tensorflow/stream_executor/cuda/cuda_driver.cc:318] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2022-06-12 15:38:28.608087: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (ac9eb598934a): /proc/driver/nvidia/version does not exist
2022-06-12 15:38:28.615211: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2200220000 Hz
2022-06-12 15:38:28.615511: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x2ed12c0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2022-06-12 15:38:28.615558: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
WARNING:tensorflow:From to_frozen_graph.py:41: The name tf.train.import_meta_graph is deprecated. Please use tf.compat.v1.train.import_meta_graph instead.
WARNING:tensorflow:From to_frozen_graph.py:49: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.
WARNING:tensorflow:From to_frozen_graph.py:50: convert_variables_to_constants (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.convert_variables_to_constants`
WARNING:tensorflow:From /tensorflow-1.15.2/python3.7/tensorflow_core/python/framework/graph_util_impl.py:277: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.extract_sub_graph`
Traceback (most recent call last):
File "to_frozen_graph.py", line 66, in <module>
freeze_graph(args.model_dir, args.output_node_names)
File "to_frozen_graph.py", line 50, in freeze_graph
output_node_names.split(",") # The output node names are used to select the usefull nodes
File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/util/deprecation.py", line 324, in new_func
return func(*args, **kwargs)
File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/framework/graph_util_impl.py", line 277, in convert_variables_to_constants
inference_graph = extract_sub_graph(input_graph_def, output_node_names)
File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/util/deprecation.py", line 324, in new_func
return func(*args, **kwargs)
File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/framework/graph_util_impl.py", line 197, in extract_sub_graph
_assert_nodes_are_present(name_to_node, dest_nodes)
File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/framework/graph_util_impl.py", line 152, in _assert_nodes_are_present
assert d in name_to_node, "%s is not in graph" % d
AssertionError: yolov3/convolutional59/ is not in graph
What can I try to solve this?
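The assertion at the bottom of the traceback checks that every name passed in `--output_node_names` is an exact key in the graph's node table, so a scope prefix such as `yolov3/convolutional59/` fails unless a node has precisely that name. A rough way to find the right name is to list the graph's node names (in TF 1.x, for example, `[n.name for n in tf.get_default_graph().as_graph_def().node]` after the meta graph is imported) and look for nodes under the requested scope. The helper below is a hypothetical sketch of that lookup using only the standard library; the node names in the usage note are invented.

```python
import difflib


def suggest_output_nodes(requested, node_names, limit=5):
    """Map each requested name missing from the graph to likely candidates."""
    suggestions = {}
    known = set(node_names)
    for name in requested:
        if name in known:
            continue  # exact match, nothing to suggest
        prefix = name.rstrip("/")
        # Prefer nodes that live under the requested scope...
        under_scope = [
            n for n in node_names if n == prefix or n.startswith(prefix + "/")
        ]
        # ...and fall back to fuzzy matches when the scope is empty.
        candidates = under_scope or difflib.get_close_matches(
            prefix, node_names, n=limit
        )
        suggestions[name] = candidates[:limit]
    return suggestions
```

For instance, with invented node names `yolov3/convolutional59/Conv2D` and `yolov3/convolutional59/BiasAdd`, calling `suggest_output_nodes(["yolov3/convolutional59/"], names)` returns both as candidates; whichever one is the layer's final op would be the name to pass instead.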

Tensor2Tensor: My custom problem never registered with registry problems

Description
I am following a Microsoft tutorial from this website
to build a model that generates Chinese couplets.
I have trained the model on Google Cloud and I get good inference results.
However, when I build the inference service, the function that talks to the TensorFlow Serving API can't find my problem in the registry.
I also trained this model for one step with t2t_trainer --registry_help added, and there I can see that my problem is registered under Problems.
My code is the same as the script in this repo.
And here is my test code:
from up2down_model.up2down_model import up2down
upper_couplet = input()
up2down.get_down_couplet([upper_couplet])
Environment information:
OS: Ubuntu 20.04
$ pip freeze | grep tensor
tensor2tensor 1.15.6
tensorboard 1.14.0
tensorflow 1.14.0
tensorflow-addons 0.10.0
tensorflow-datasets 1.3.0
tensorflow-estimator 1.14.0
tensorflow-gan 2.0.0
tensorflow-hub 0.8.0
tensorflow-metadata 0.22.0
tensorflow-probability 0.7.0
tensorflow-serving-api 1.14.0
Python: 3.7.7
Error logs:
Traceback (most recent call last):
File "/home/enigma/anaconda3/envs/NLP/lib/python3.7/site-packages/tensor2tensor/utils/registry.py", line 509, in problem
return Registries.problems[spec.base_name](
File "/home/enigma/anaconda3/envs/NLP/lib/python3.7/site-packages/tensor2tensor/utils/registry.py", line 254, in __getitem__
(key, self.name, display_list_by_prefix(sorted(self), 4)))
KeyError: 'translate_up2down never registered with registry problems. Available:
(the list of all built-in problems, which does not include my own)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "test.py", line 1, in <module>
from up2down_model.up2down_model import up2down
File "/home/enigma/Desktop/NLP/service/up2down_model/up2down_model.py", line 85, in <module>
up2down = up2down_class(FLAGS,server_address) # inference model
File "/home/enigma/Desktop/NLP/service/up2down_model/up2down_model.py", line 40, in __init__
self.problem = registry.problem(self.FLAGS.problem)
File "/home/enigma/anaconda3/envs/NLP/lib/python3.7/site-packages/tensor2tensor/utils/registry.py", line 513, in problem
return env_problem(problem_name, **kwargs)
File "/home/enigma/anaconda3/envs/NLP/lib/python3.7/site-packages/tensor2tensor/utils/registry.py", line 527, in env_problem
ep_cls = Registries.env_problems[env_problem_name]
File "/home/enigma/anaconda3/envs/NLP/lib/python3.7/site-packages/tensor2tensor/utils/registry.py", line 254, in __getitem__
(key, self.name, display_list_by_prefix(sorted(self), 4)))
KeyError: 'translate_up2down never registered with registry env_problems. Available:\n reacher:\n * reacher_env_problem\n tic:\n * tic_tac_toe_env_problem'
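A possible explanation, offered as an assumption rather than a diagnosis: decorator-based registries only know about a class once the module that defines it has been imported in the current process. The trainer may import the user module (for example when it is given `--t2t_usr_dir`), while test.py never does, so the lookup falls through to the error above. The toy registry below is not Tensor2Tensor code; `Registry`, `problems` and `Up2Down` are invented names that only mimic the behaviour.

```python
class Registry:
    """A tiny name -> class registry, filled in by a decorator."""

    def __init__(self, name):
        self.name = name
        self._items = {}

    def register(self, cls):
        # Runs when the decorated class definition is executed,
        # i.e. when the module that defines it is imported.
        self._items[cls.__name__.lower()] = cls
        return cls

    def __getitem__(self, key):
        if key not in self._items:
            raise KeyError(
                "%s never registered with registry %s" % (key, self.name)
            )
        return self._items[key]


problems = Registry("problems")


def define_user_problem():
    """Stands in for importing the module that defines the custom problem."""

    @problems.register
    class Up2Down:  # hypothetical problem class
        pass

    return Up2Down
```

Looking up `problems["up2down"]` raises the KeyError until `define_user_problem()` has run. If that is what is happening here, importing the module that defines the problem at the top of the service code, before `registry.problem(...)` is called, should make it visible.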

export_model.py - Not found: Tensor name "MobilenetV2/Conv/BatchNorm/beta" not found in checkpoint files

I've been trying to train my own DeepLab model by following https://github.com/tensorflow/models/blob/master/research/deeplab/g3doc/pascal.md.
I'm running everything on Google Colab.
I've been able to train the model fine:
%%shell
export PYTHONPATH=$PYTHONPATH:"/content/models/research":"/content/models/research/slim"
NUM_ITERATIONS=50
python3 train.py \
--logtostderr \
--train_split="train" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--train_crop_size=200,200 \
--train_batch_size=12 \
--training_number_of_steps="${NUM_ITERATIONS}" \
--fine_tune_batch_norm=true \
--tf_initial_checkpoint="/content/deeplabv3_pascal_train_aug/model.ckpt.index" \
--train_logdir="/content/output" \
--dataset_dir="/content/drive/My Drive/Colab Notebooks/Background Removal/tfrecord"
And create visualizations fine:
%%shell
export PYTHONPATH=$PYTHONPATH:"/content/models/research":"/content/models/research/slim"
python3 vis.py \
--logtostderr \
--vis_split="val" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--vis_crop_size=200,200 \
--checkpoint_dir=/content/output \
--vis_logdir=/content/output/vis \
--dataset_dir="/content/drive/My Drive/Colab Notebooks/Background Removal/tfrecord" \
--max_number_of_iterations=1
But running export_model.py does not work. I thought it might have been an issue with the model I have trained, so I tried exporting the initial checkpoint I am training off of - it doesn't work either.
%%shell
export PYTHONPATH=$PYTHONPATH:"/content/models/research":"/content/models/research/slim"
NUM_ITERATIONS=50
python3 export_model.py \
--logtostderr \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--crop_size=200 \
--crop_size=200 \
--checkpoint_path='/content/output/model.ckpt-50.index' \
--export_path='/content/output'
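One thing worth checking here is whether the export command passes the same model-defining flags as the train and vis commands. Those two set `--model_variant="xception_65"` and `--decoder_output_stride=4`, while the export command sets neither, and a `MobilenetV2/...` variable name in the error would be consistent with the export graph being built for a different backbone than the checkpoint. The sketch below is a hypothetical stdlib-only helper (the function names are made up) that diffs the flags of two commands.

```python
import re


def parse_flags(command):
    """Collect --name=value flags from a shell command into {name: [values]}."""
    flags = {}
    pattern = r"""--(\w+)=("[^"]*"|'[^']*'|\S+)"""
    for name, value in re.findall(pattern, command):
        # Repeated flags (e.g. --atrous_rates) accumulate in order.
        flags.setdefault(name, []).append(value.strip("\"'"))
    return flags


def flags_missing_from(reference, other, names):
    """Return the flags in `names` that `reference` sets but `other` omits."""
    return {n: reference[n] for n in names if n in reference and n not in other}
```

Calling `flags_missing_from(parse_flags(train_cmd), parse_flags(export_cmd), [...])` with the two commands from this post and the architecture flags of interest lists `model_variant` and `decoder_output_stride` as missing from the export command.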
Full output from running export_model.py:
WARNING:tensorflow:From /content/models/research/deeplab/core/conv2d_ws.py:40: The name tf.layers.Layer is deprecated. Please use tf.compat.v1.layers.Layer instead.
WARNING:tensorflow:
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
* https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
* https://github.com/tensorflow/addons
* https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.
WARNING:tensorflow:From export_model.py:201: The name tf.app.run is deprecated. Please use tf.compat.v1.app.run instead.
WARNING:tensorflow:From export_model.py:117: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead.
W0329 17:24:00.753659 139709292058496 module_wrapper.py:139] From export_model.py:117: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead.
WARNING:tensorflow:From export_model.py:117: The name tf.logging.INFO is deprecated. Please use tf.compat.v1.logging.INFO instead.
W0329 17:24:00.753914 139709292058496 module_wrapper.py:139] From export_model.py:117: The name tf.logging.INFO is deprecated. Please use tf.compat.v1.logging.INFO instead.
WARNING:tensorflow:From export_model.py:118: The name tf.logging.info is deprecated. Please use tf.compat.v1.logging.info instead.
W0329 17:24:00.754124 139709292058496 module_wrapper.py:139] From export_model.py:118: The name tf.logging.info is deprecated. Please use tf.compat.v1.logging.info instead.
INFO:tensorflow:Prepare to export model to: /content/output
I0329 17:24:00.754279 139709292058496 export_model.py:118] Prepare to export model to: /content/output
WARNING:tensorflow:From export_model.py:91: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
W0329 17:24:00.755340 139709292058496 module_wrapper.py:139] From export_model.py:91: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
INFO:tensorflow:Exported model performs single-scale inference.
I0329 17:24:00.817728 139709292058496 export_model.py:130] Exported model performs single-scale inference.
WARNING:tensorflow:From /content/models/research/deeplab/model.py:320: The name tf.AUTO_REUSE is deprecated. Please use tf.compat.v1.AUTO_REUSE instead.
W0329 17:24:00.818036 139709292058496 module_wrapper.py:139] From /content/models/research/deeplab/model.py:320: The name tf.AUTO_REUSE is deprecated. Please use tf.compat.v1.AUTO_REUSE instead.
WARNING:tensorflow:From /content/models/research/deeplab/core/feature_extractor.py:461: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
W0329 17:24:00.818522 139709292058496 deprecation.py:323] From /content/models/research/deeplab/core/feature_extractor.py:461: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
WARNING:tensorflow:From /content/models/research/deeplab/core/feature_extractor.py:75: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.
W0329 17:24:00.821603 139709292058496 module_wrapper.py:139] From /content/models/research/deeplab/core/feature_extractor.py:75: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.
WARNING:tensorflow:From /tensorflow-1.15.2/python3.6/tensorflow_core/contrib/layers/python/layers/layers.py:1057: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.__call__` method instead.
W0329 17:24:00.825009 139709292058496 deprecation.py:323] From /tensorflow-1.15.2/python3.6/tensorflow_core/contrib/layers/python/layers/layers.py:1057: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.__call__` method instead.
WARNING:tensorflow:From /content/models/research/deeplab/core/utils.py:41: The name tf.image.resize_bilinear is deprecated. Please use tf.compat.v1.image.resize_bilinear instead.
W0329 17:24:02.636440 139709292058496 module_wrapper.py:139] From /content/models/research/deeplab/core/utils.py:41: The name tf.image.resize_bilinear is deprecated. Please use tf.compat.v1.image.resize_bilinear instead.
WARNING:tensorflow:From export_model.py:162: The name tf.image.resize_images is deprecated. Please use tf.image.resize instead.
W0329 17:24:02.986706 139709292058496 module_wrapper.py:139] From export_model.py:162: The name tf.image.resize_images is deprecated. Please use tf.image.resize instead.
WARNING:tensorflow:From export_model.py:178: The name tf.train.Saver is deprecated. Please use tf.compat.v1.train.Saver instead.
W0329 17:24:02.991279 139709292058496 module_wrapper.py:139] From export_model.py:178: The name tf.train.Saver is deprecated. Please use tf.compat.v1.train.Saver instead.
WARNING:tensorflow:From export_model.py:178: all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Please use tf.global_variables instead.
W0329 17:24:02.991502 139709292058496 deprecation.py:323] From export_model.py:178: all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Please use tf.global_variables instead.
WARNING:tensorflow:From export_model.py:181: The name tf.gfile.MakeDirs is deprecated. Please use tf.io.gfile.makedirs instead.
W0329 17:24:03.295938 139709292058496 module_wrapper.py:139] From export_model.py:181: The name tf.gfile.MakeDirs is deprecated. Please use tf.io.gfile.makedirs instead.
WARNING:tensorflow:From export_model.py:182: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.
W0329 17:24:03.296255 139709292058496 module_wrapper.py:139] From export_model.py:182: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.
WARNING:tensorflow:From /tensorflow-1.15.2/python3.6/tensorflow_core/python/tools/freeze_graph.py:127: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
W0329 17:24:03.419735 139709292058496 deprecation.py:323] From /tensorflow-1.15.2/python3.6/tensorflow_core/python/tools/freeze_graph.py:127: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
2020-03-29 17:24:03.901045: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-03-29 17:24:03.919472: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-03-29 17:24:03.920276: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Found device 0 with properties:
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 0000:00:04.0
2020-03-29 17:24:03.920544: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-03-29 17:24:03.922225: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-03-29 17:24:03.923832: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-03-29 17:24:03.924132: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-03-29 17:24:03.926131: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-03-29 17:24:03.927020: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-03-29 17:24:03.930883: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-03-29 17:24:03.931017: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-03-29 17:24:03.931838: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-03-29 17:24:03.932481: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1767] Adding visible gpu devices: 0
2020-03-29 17:24:03.937940: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2300000000 Hz
2020-03-29 17:24:03.938159: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x1a83480 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-03-29 17:24:03.938192: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-03-29 17:24:03.993090: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-03-29 17:24:03.993934: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x1a83640 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-03-29 17:24:03.993966: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Tesla K80, Compute Capability 3.7
2020-03-29 17:24:03.994138: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-03-29 17:24:03.994819: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Found device 0 with properties:
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 0000:00:04.0
2020-03-29 17:24:03.994883: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-03-29 17:24:03.994912: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-03-29 17:24:03.994937: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-03-29 17:24:03.994960: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-03-29 17:24:03.994984: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-03-29 17:24:03.995007: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-03-29 17:24:03.995031: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-03-29 17:24:03.995121: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-03-29 17:24:03.995850: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-03-29 17:24:03.996477: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1767] Adding visible gpu devices: 0
2020-03-29 17:24:03.996539: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-03-29 17:24:03.998097: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1180] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-03-29 17:24:03.998127: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1186] 0
2020-03-29 17:24:03.998140: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1199] 0: N
2020-03-29 17:24:03.998307: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-03-29 17:24:03.999000: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-03-29 17:24:03.999707: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:39] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2020-03-29 17:24:03.999752: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1325] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10805 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7)
INFO:tensorflow:Restoring parameters from /content/output/model.ckpt-50.index
I0329 17:24:04.002565 139709292058496 saver.py:1284] Restoring parameters from /content/output/model.ckpt-50.index
Traceback (most recent call last):
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/client/session.py", line 1365, in _do_call
return fn(*args)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/client/session.py", line 1350, in _run_fn
target_list, run_metadata)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.NotFoundError: 2 root error(s) found.
(0) Not found: Tensor name "MobilenetV2/Conv/BatchNorm/beta" not found in checkpoint files /content/output/model.ckpt-50.index
[[{{node save/RestoreV2}}]]
(1) Not found: Tensor name "MobilenetV2/Conv/BatchNorm/beta" not found in checkpoint files /content/output/model.ckpt-50.index
[[{{node save/RestoreV2}}]]
[[save/RestoreV2/_301]]
0 successful operations.
0 derived errors ignored.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/training/saver.py", line 1290, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/client/session.py", line 956, in run
run_metadata_ptr)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/client/session.py", line 1180, in _run
feed_dict_tensor, options, run_metadata)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/client/session.py", line 1359, in _do_run
run_metadata)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/client/session.py", line 1384, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: 2 root error(s) found.
(0) Not found: Tensor name "MobilenetV2/Conv/BatchNorm/beta" not found in checkpoint files /content/output/model.ckpt-50.index
[[node save/RestoreV2 (defined at /tensorflow-1.15.2/python3.6/tensorflow_core/python/framework/ops.py:1748) ]]
(1) Not found: Tensor name "MobilenetV2/Conv/BatchNorm/beta" not found in checkpoint files /content/output/model.ckpt-50.index
[[node save/RestoreV2 (defined at /tensorflow-1.15.2/python3.6/tensorflow_core/python/framework/ops.py:1748) ]]
[[save/RestoreV2/_301]]
0 successful operations.
0 derived errors ignored.
Original stack trace for 'save/RestoreV2':
File "export_model.py", line 201, in <module>
tf.app.run()
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/platform/app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 299, in run
_run_main(main, args)
File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 250, in _run_main
sys.exit(main(argv))
File "export_model.py", line 178, in main
saver = tf.train.Saver(tf.all_variables())
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/training/saver.py", line 828, in __init__
self.build()
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/training/saver.py", line 840, in build
self._build(self._filename, build_save=True, build_restore=True)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/training/saver.py", line 878, in _build
build_restore=build_restore)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/training/saver.py", line 508, in _build_internal
restore_sequentially, reshape)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/training/saver.py", line 328, in _AddRestoreOps
restore_sequentially)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/training/saver.py", line 575, in bulk_restore
return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/ops/gen_io_ops.py", line 1696, in restore_v2
name=name)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper
op_def=op_def)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/framework/ops.py", line 3357, in create_op
attrs, op_def, compute_device)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal
op_def=op_def)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/framework/ops.py", line 1748, in __init__
self._traceback = tf_stack.extract_stack()
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/training/saver.py", line 1300, in restore
names_to_keys = object_graph_key_mapping(save_path)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/training/saver.py", line 1618, in object_graph_key_mapping
object_graph_string = reader.get_tensor(trackable.OBJECT_GRAPH_PROTO_KEY)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/pywrap_tensorflow_internal.py", line 915, in get_tensor
return CheckpointReader_GetTensor(self, compat.as_bytes(tensor_str))
tensorflow.python.framework.errors_impl.NotFoundError: _CHECKPOINTABLE_OBJECT_GRAPH not found in checkpoint file
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "export_model.py", line 201, in <module>
tf.app.run()
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/platform/app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 299, in run
_run_main(main, args)
File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 250, in _run_main
sys.exit(main(argv))
File "export_model.py", line 192, in main
initializer_nodes=None)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/tools/freeze_graph.py", line 151, in freeze_graph_with_def_protos
saver.restore(sess, input_checkpoint)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/training/saver.py", line 1306, in restore
err, "a Variable name or other graph key that is missing")
tensorflow.python.framework.errors_impl.NotFoundError: Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:
2 root error(s) found.
(0) Not found: Tensor name "MobilenetV2/Conv/BatchNorm/beta" not found in checkpoint files /content/output/model.ckpt-50.index
[[node save/RestoreV2 (defined at /tensorflow-1.15.2/python3.6/tensorflow_core/python/framework/ops.py:1748) ]]
(1) Not found: Tensor name "MobilenetV2/Conv/BatchNorm/beta" not found in checkpoint files /content/output/model.ckpt-50.index
[[node save/RestoreV2 (defined at /tensorflow-1.15.2/python3.6/tensorflow_core/python/framework/ops.py:1748) ]]
[[save/RestoreV2/_301]]
0 successful operations.
0 derived errors ignored.
Original stack trace for 'save/RestoreV2':
File "export_model.py", line 201, in <module>
tf.app.run()
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/platform/app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 299, in run
_run_main(main, args)
File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 250, in _run_main
sys.exit(main(argv))
File "export_model.py", line 178, in main
saver = tf.train.Saver(tf.all_variables())
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/training/saver.py", line 828, in __init__
self.build()
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/training/saver.py", line 840, in build
self._build(self._filename, build_save=True, build_restore=True)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/training/saver.py", line 878, in _build
build_restore=build_restore)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/training/saver.py", line 508, in _build_internal
restore_sequentially, reshape)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/training/saver.py", line 328, in _AddRestoreOps
restore_sequentially)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/training/saver.py", line 575, in bulk_restore
return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/ops/gen_io_ops.py", line 1696, in restore_v2
name=name)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper
op_def=op_def)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/framework/ops.py", line 3357, in create_op
attrs, op_def, compute_device)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal
op_def=op_def)
File "/tensorflow-1.15.2/python3.6/tensorflow_core/python/framework/ops.py", line 1748, in __init__
self._traceback = tf_stack.extract_stack()
---------------------------------------------------------------------------
CalledProcessError Traceback (most recent call last)
<ipython-input-14-46a5ede3bd50> in <module>()
----> 1 get_ipython().run_cell_magic('shell', '', 'export PYTHONPATH=$PYTHONPATH:"/content/models/research":"/content/models/research/slim"\nNUM_ITERATIONS=50\npython3 export_model.py \\\n --logtostderr \\\n --atrous_rates=6 \\\n --atrous_rates=12 \\\n --atrous_rates=18 \\\n --output_stride=16 \\\n --crop_size=200 \\\n --crop_size=200 \\\n --checkpoint_path=\'/content/output/model.ckpt-50.index\' \\\n --export_path=\'/content/output\'')
2 frames
/usr/local/lib/python3.6/dist-packages/google/colab/_system_commands.py in check_returncode(self)
136 if self.returncode:
137 raise subprocess.CalledProcessError(
--> 138 returncode=self.returncode, cmd=self.args, output=self.output)
139
140 def _repr_pretty_(self, p, cycle): # pylint:disable=unused-argument
CalledProcessError: Command 'export PYTHONPATH=$PYTHONPATH:"/content/models/research":"/content/models/research/slim"
NUM_ITERATIONS=50
python3 export_model.py \
--logtostderr \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--crop_size=200 \
--crop_size=200 \
--checkpoint_path='/content/output/model.ckpt-50.index' \
--export_path='/content/output'' returned non-zero exit status 1.
I'm aware of similar GitHub issues (https://github.com/tensorflow/models/issues/6212 and https://github.com/tensorflow/models/issues/3992), but neither appears to have been resolved. I also tried poking around in the export_model.py code in deeplab, but I don't understand the TF code well enough to know where to look.
By default, export_model.py looks for checkpoint variables from a MobileNet-v2 backbone, but you trained your model with an Xception backbone. Add the '--model_variant="xception_65"' argument to your export_model.py command.
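To confirm which backbone a checkpoint was trained with before exporting, one option is to look at the scope prefixes of its variable names. The sketch below is plain Python. `guess_model_variant` and the prefix table are hypothetical, and the `xception_65/` scope name is an assumption (only `MobilenetV2/` appears in the error above). The sample variable name is made up for illustration.

```python
# Hypothetical helper: guess the DeepLab --model_variant from checkpoint
# variable names. The prefix table is an assumption; extend it as needed.
KNOWN_PREFIXES = {
    "MobilenetV2/": "mobilenet_v2",
    "xception_65/": "xception_65",
}


def guess_model_variant(variable_names):
    """Return the variant whose scope prefix matches any name, else None."""
    for name in variable_names:
        for prefix, variant in KNOWN_PREFIXES.items():
            if name.startswith(prefix):
                return variant
    return None


if __name__ == "__main__":
    # With TensorFlow installed, the names could come from the checkpoint:
    #   import tensorflow as tf
    #   names = [n for n, _ in tf.train.list_variables(checkpoint_path)]
    # The list below is an illustrative stand-in, not real checkpoint output.
    names = ["xception_65/entry_flow/conv1_1/weights", "global_step"]
    print(guess_model_variant(names))
```

If the helper returns a variant other than the one export_model.py was run with, that mismatch would explain the "Tensor name ... not found in checkpoint" error.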

Linking the French Spacy model but failing to load it

I successfully downloaded SpaCy and the French model to apply it to the Rasa starter pack. Yet when running the rasa_nlu training command it seems the OS can't find the French model.
C:\Users\antoi\Documents\Programming\PACO\starter-pack-rasa-stack\staenv\lib\site-packages\h5py\__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
from ._conv import register_converters as _register_converters
2019-05-02 19:14:58 INFO rasa_nlu.utils.spacy_utils - Trying to load spacy model with name 'fr'
Traceback (most recent call last):
File "C:\Python36\lib\runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "C:\Python36\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\Users\antoi\Documents\Programming\PACO\starter-pack-rasa-stack\staenv\lib\site-packages\rasa_nlu\train.py", line 184, in <module>
num_threads=cmdline_args.num_threads)
File "C:\Users\antoi\Documents\Programming\PACO\starter-pack-rasa-stack\staenv\lib\site-packages\rasa_nlu\train.py", line 148, in do_train
trainer = Trainer(cfg, component_builder)
File "C:\Users\antoi\Documents\Programming\PACO\starter-pack-rasa-stack\staenv\lib\site-packages\rasa_nlu\model.py", line 155, in __init__
self.pipeline = self._build_pipeline(cfg, component_builder)
File "C:\Users\antoi\Documents\Programming\PACO\starter-pack-rasa-stack\staenv\lib\site-packages\rasa_nlu\model.py", line 166, in _build_pipeline
component_name, cfg)
File "C:\Users\antoi\Documents\Programming\PACO\starter-pack-rasa-stack\staenv\lib\site-packages\rasa_nlu\components.py", line 441, in create_component
cfg)
File "C:\Users\antoi\Documents\Programming\PACO\starter-pack-rasa-stack\staenv\lib\site-packages\rasa_nlu\registry.py", line 142, in create_component_by_name
return component_clz.create(config)
File "C:\Users\antoi\Documents\Programming\PACO\starter-pack-rasa-stack\staenv\lib\site-packages\rasa_nlu\utils\spacy_utils.py", line 73, in create
nlp = spacy.load(spacy_model_name, parser=False)
File "C:\Users\antoi\Documents\Programming\PACO\starter-pack-rasa-stack\staenv\lib\site-packages\spacy\__init__.py", line 27, in load
return util.load_model(name, **overrides)
File "C:\Users\antoi\Documents\Programming\PACO\starter-pack-rasa-stack\staenv\lib\site-packages\spacy\util.py", line 136, in load_model
raise IOError(Errors.E050.format(name=name))
OSError: [E050] Can't find model 'fr'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory.
(staenv) C:\Users\antoi\Documents\Programming\PACO\starter-pack-rasa-stack>python -m spacy download fr
Requirement already satisfied: fr_core_news_sm==2.1.0 from https://github.com/explosion/spacy-models/releases/download/fr_core_news_sm-2.1.0/fr_core_news_sm-2.1.0.tar.gz#egg=fr_core_news_sm==2.1.0 in c:\users\antoi\documents\programming\paco\starter-pack-rasa-stack\staenv\lib\site-packages (2.1.0)
You are using pip version 10.0.1, however version 19.1 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.
✔ Download and installation successful
You can now load the model via spacy.load('fr_core_news_sm')
You do not have sufficient privilege to perform this operation.
✔ Linking successful
C:\Users\antoi\Documents\Programming\PACO\starter-pack-rasa-stack\staenv\lib\site-packages\fr_core_news_sm
-->
C:\Users\antoi\Documents\Programming\PACO\starter-pack-rasa-stack\staenv\lib\site-packages\spacy\data\fr
You can now load the model via spacy.load('fr')
My spacy version is 2.1.3
I think this is actually a spaCy issue.
The line "You do not have sufficient privilege to perform this operation." suggests that you have to run the Windows command line as administrator.
This spaCy issue describes the same problem and gives further recommendations.
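If the shortcut link cannot be created, the download output above says the model can also be loaded by its package name, `fr_core_news_sm`. Here is a minimal sketch that tries the shortcut first and falls back to the package name. `load_first_available` is a hypothetical helper, and the loader is passed in as an argument so the logic can be tried without spaCy installed.

```python
def load_first_available(names, loader):
    """Try each model name in order; return (name, model) for the first that loads.

    `loader` is any callable that raises OSError for an unknown name,
    as spacy.load does in the traceback above.
    """
    last_error = None
    for name in names:
        try:
            return name, loader(name)
        except OSError as err:
            last_error = err
    raise OSError("none of the models could be loaded: %r" % (names,)) from last_error


if __name__ == "__main__":
    # Stand-in loader for illustration; with spaCy installed use spacy.load.
    def fake_loader(name):
        if name != "fr_core_news_sm":
            raise OSError("[E050] Can't find model %r" % name)
        return object()

    name, _ = load_first_available(["fr", "fr_core_news_sm"], fake_loader)
    print(name)
```

With spaCy installed, the call would be `load_first_available(["fr", "fr_core_news_sm"], spacy.load)`. This only tells you which name works in your environment. Rasa still loads whatever name its own configuration gives it.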

Error loading library gpuarray with Theano

I am trying to run this script to test Theano's use of my GPU and get the following error:
ERROR (theano.gpuarray): Could not initialize pygpu, support disabled
Traceback (most recent call last):
File "/home/me/anaconda3/envs/py35/lib/python3.5/site-packages/theano/gpuarray/__init__.py", line 164, in <module>
use(config.device)
File "/home/me/anaconda3/envs/py35/lib/python3.5/site-packages/theano/gpuarray/__init__.py", line 151, in use
init_dev(device)
File "/home/me/anaconda3/envs/py35/lib/python3.5/site-packages/theano/gpuarray/__init__.py", line 60, in init_dev
sched=config.gpuarray.sched)
File "pygpu/gpuarray.pyx", line 614, in pygpu.gpuarray.init (pygpu/gpuarray.c:9419)
File "pygpu/gpuarray.pyx", line 566, in pygpu.gpuarray.pygpu_init (pygpu/gpuarray.c:9110)
File "pygpu/gpuarray.pyx", line 1021, in pygpu.gpuarray.GpuContext.__cinit__ (pygpu/gpuarray.c:13472)
pygpu.gpuarray.GpuArrayException: Error loading library: -1
I need to use the nvidia-381 driver since my GPU is a 1080 Ti, which is not compatible with nvidia-375. I'm not sure if that matters, but installing nvcc overwrites 381, and reinstalling 381 after setting up nvcc causes some errors, so I can't use nvcc.
I can import pygpu without errors, but when I run pygpu.test() I get the following error, and I don't know how to specify the DEVICE variable without nvcc.
======================================================================
ERROR: Failure: RuntimeError (No test device specified. Specify one using the DEVICE or GPUARRAY_TEST_DEVICE environment variables.)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/me/anaconda3/envs/py35/lib/python3.5/site-packages/nose/failure.py", line 39, in runTest
raise self.exc_val.with_traceback(self.tb)
File "/home/me/anaconda3/envs/py35/lib/python3.5/site-packages/nose/loader.py", line 418, in loadTestsFromName
addr.filename, addr.module)
File "/home/me/anaconda3/envs/py35/lib/python3.5/site-packages/nose/importer.py", line 47, in importFromPath
return self.importFromDir(dir_path, fqname)
File "/home/me/anaconda3/envs/py35/lib/python3.5/site-packages/nose/importer.py", line 94, in importFromDir
mod = load_module(part_fqname, fh, filename, desc)
File "/home/me/anaconda3/envs/py35/lib/python3.5/imp.py", line 234, in load_module
return load_source(name, filename, file)
File "/home/me/anaconda3/envs/py35/lib/python3.5/imp.py", line 172, in load_source
module = _load(spec)
File "<frozen importlib._bootstrap>", line 693, in _load
File "<frozen importlib._bootstrap>", line 673, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 665, in exec_module
File "<frozen importlib._bootstrap>", line 222, in _call_with_frames_removed
File "/home/me/.local/lib/python3.5/site-packages/pygpu-0.6.2-py3.5-linux-x86_64.egg/pygpu/tests/test_tools.py", line 5, in <module>
from .support import (guard_devsup, rand, check_flags, check_meta, check_all,
File "/home/me/.local/lib/python3.5/site-packages/pygpu-0.6.2-py3.5-linux-x86_64.egg/pygpu/tests/support.py", line 32, in <module>
context = gpuarray.init(get_env_dev())
File "/home/me/.local/lib/python3.5/site-packages/pygpu-0.6.2-py3.5-linux-x86_64.egg/pygpu/tests/support.py", line 29, in get_env_dev
raise RuntimeError("No test device specified. Specify one using the DEVICE or GPUARRAY_TEST_DEVICE environment variables.")
RuntimeError: No test device specified. Specify one using the DEVICE or GPUARRAY_TEST_DEVICE environment variables.
----------------------------------------------------------------------
Ran 7 tests in 0.003s
FAILED (errors=7)
<nose.result.TextTestResult run=7 errors=7 failures=0>
Warning: it's entirely possible that this is all wrong and that the actual reason for your problem is, as you suspect, your GPU driver.
I had the same issue with gpuarray on Windows 10.
In the end I solved it by:
completely uninstalling Python
installing CUDA 8.0 (with cuDNN 5.1)
installing Anaconda
installing Theano through Anaconda:
conda install theano pygpu
As you are using Linux: this error message basically means "it didn't work, don't ask me why", and it is mostly shown when something is wrong with your setup (e.g. different compilers used for compiling Python and Theano, or an incompatible CUDA version).
I would recommend updating to CUDA 8.0 and reinstalling your Python environment through Anaconda (just in case).
On a side note: I tested your example script from the docs, and at least that works.
Note for Windows users: never install Anaconda in a location with spaces in the path. Everything looks fine until Theano starts having trouble finding and compiling things.
Note regarding pygpu.test():
Normally you just set the environment variable:
Windows: set DEVICE=cuda
Linux: export DEVICE=cuda
BUT the test has a habit of saying you didn't specify a device if the library couldn't be loaded.
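The same variable can also be set from Python before the pygpu tests are imported. This is a small sketch, and `set_test_device` is a hypothetical helper. It only sets the variable. It does not fix a library that fails to load.

```python
import os


def set_test_device(device="cuda", environ=None):
    """Set DEVICE for pygpu's test suite unless it is already set."""
    env = os.environ if environ is None else environ
    env.setdefault("DEVICE", device)
    return env["DEVICE"]


if __name__ == "__main__":
    print("DEVICE =", set_test_device())
    # With pygpu installed, the tests could then be run in the same process:
    #   import pygpu
    #   pygpu.test()
```

Using `setdefault` keeps any value already exported in the shell, so the helper does not override a device chosen elsewhere.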