TensorFlow: with tf.device('/gpu:0') claims ALL GPUs

I'm trying to use only one GPU on a machine with 8 GPUs.
Using the following line:
with tf.device('/gpu:0'):
creates a device for each of the eight GPUs, but I only want GPU 0.
2022-07-19 12:46:48.690776: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9652 MB memory: -> device: 0, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:01:00.0, compute capability: 7.5
2022-07-19 12:46:48.691211: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-07-19 12:46:48.691854: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 2413 MB memory: -> device: 1, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:02:00.0, compute capability: 7.5
2022-07-19 12:46:48.692299: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-07-19 12:46:48.692935: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 9652 MB memory: -> device: 2, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:03:00.0, compute capability: 7.5
2022-07-19 12:46:48.693290: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-07-19 12:46:48.693924: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 9652 MB memory: -> device: 3, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:06:00.0, compute capability: 7.5
2022-07-19 12:46:48.694286: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-07-19 12:46:48.694921: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:4 with 9652 MB memory: -> device: 4, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:0a:00.0, compute capability: 7.5
2022-07-19 12:46:48.695272: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-07-19 12:46:48.695912: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:5 with 4585 MB memory: -> device: 5, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:0d:00.0, compute capability: 7.5
2022-07-19 12:46:48.697368: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-07-19 12:46:48.698061: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:6 with 4717 MB memory: -> device: 6, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:0f:00.0, compute capability: 7.5
2022-07-19 12:46:48.698452: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-07-19 12:46:48.699093: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:7 with 9652 MB memory: -> device: 7, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:10:00.0, compute capability: 7.5
I have also tried setting CUDA_VISIBLE_DEVICES before importing TensorFlow, which has no effect. I also tried tf.config.set_visible_devices(gpus[0], "GPU"), which crashes my program with a RuntimeError: "Visible devices cannot be modified after being initialized".
This is a machine that multiple people share, so I can't just claim all the GPUs. I have tried just about everything I could find on this.
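
One thing worth checking, offered as a sketch rather than a confirmed diagnosis: tf.device only chooses where ops are placed, while TensorFlow creates a device for every GPU it can see when its runtime initializes. The example below therefore restricts visibility first. It assumes TensorFlow 2.1 or later (for tf.config.list_physical_devices and tf.config.set_visible_devices) and that nothing has touched the GPUs earlier in the same process. The helper name restrict_to_gpu is hypothetical.

```python
import os


def restrict_to_gpu(index):
    """Hide every GPU except `index` from CUDA (hypothetical helper).

    The variable has to be set before CUDA initializes in this process;
    setting it before importing TensorFlow is the safe choice.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)


restrict_to_gpu(0)

try:
    import tensorflow as tf  # imported only after the variable is set
except ImportError:
    tf = None  # lets the sketch run on a machine without TensorFlow

if tf is not None:
    gpus = tf.config.list_physical_devices("GPU")
    if gpus:
        try:
            # Second line of defence; only allowed before the runtime
            # has initialized its devices.
            tf.config.set_visible_devices(gpus[0], "GPU")
            # Allocate GPU memory on demand instead of up front.
            tf.config.experimental.set_memory_growth(gpus[0], True)
        except RuntimeError as err:
            print("GPUs were already initialized:", err)
    print(tf.config.list_logical_devices("GPU"))
```

If CUDA_VISIBLE_DEVICES appears to have no effect, it may be that another import, or a long-running notebook kernel, initialized CUDA earlier in the same process. The same applies to the RuntimeError from set_visible_devices.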

Related

Am I using tensorflow GPU?

I'm using TensorFlow-GPU 1.14 on Ubuntu 16.04.
As I'm not familiar with TensorFlow, I wonder whether I'm actually using the GPU or not.
I have
GeForce GTX 1060
Nvidia-driver 418
CUDA 10.0
cuDNN v7.6.5
And when I execute my code I always get this message:
WARNING:tensorflow:From /home/mine/catkin_ws/src/PROJECT/project6_3/src/ddpg.py:26: The name tf.InteractiveSession is deprecated. Please use tf.compat.v1.InteractiveSession instead.
2020-06-24 20:29:13.827441: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-06-24 20:29:13.834067: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2020-06-24 20:29:13.930412: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-24 20:29:13.931260: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x6a40a50 executing computations on platform CUDA. Devices:
2020-06-24 20:29:13.931277: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): GeForce GTX 1060 6GB, Compute Capability 6.1
2020-06-24 20:29:13.959129: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3000000000 Hz
2020-06-24 20:29:13.959392: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x6ab1d70 executing computations on platform Host. Devices:
2020-06-24 20:29:13.959409: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): <undefined>, <undefined>
2020-06-24 20:29:13.959576: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-24 20:29:13.960326: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: GeForce GTX 1060 6GB major: 6 minor: 1 memoryClockRate(GHz): 1.759
pciBusID: 0000:01:00.0
2020-06-24 20:29:13.961867: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2020-06-24 20:29:13.988711: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2020-06-24 20:29:14.002012: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2020-06-24 20:29:14.006381: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2020-06-24 20:29:14.038179: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2020-06-24 20:29:14.057922: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2020-06-24 20:29:14.114149: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2020-06-24 20:29:14.114248: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-24 20:29:14.115060: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-24 20:29:14.115765: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2020-06-24 20:29:14.116472: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2020-06-24 20:29:14.118350: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-06-24 20:29:14.118378: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187] 0
2020-06-24 20:29:14.118386: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0: N
2020-06-24 20:29:14.119144: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-24 20:29:14.119963: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-24 20:29:14.120610: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4889 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060 6GB, pci bus id: 0000:01:00.0, compute capability: 6.1)
WARNING:tensorflow:From /home/mine/catkin_ws/src/PROJECT/project6_3/src/actor_network_bn.py:75: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
WARNING:tensorflow:From /home/mine/catkin_ws/src/PROJECT/project6_3/src/actor_network_bn.py:177: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.
WARNING:tensorflow:From /home/mine/catkin_ws/src/PROJECT/project6_3/src/actor_network_bn.py:178: The name tf.get_variable is deprecated. Please use tf.compat.v1.get_variable instead.
Does anyone know what this message means?
Am I using the GPU properly?
(When I check GPU usage with the following command
$ watch -d -n 0.5 nvidia-smi
it always reports 1407 MiB / 6000 MiB in use.)
Additionally, should I modify my code as the WARNING messages suggest?
(It currently works reasonably well.)
Thanks in advance. :)

Am I using TensorFlow GPU?
If you run the code below and it returns an entry with device_type='GPU', your TensorFlow GPU installation is working and you are good to go.
import tensorflow as tf
tf.config.list_physical_devices('GPU')
Output:
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
2020-06-24 20:29:14.120610: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4889 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060 6GB, pci bus id: 0000:01:00.0, compute capability: 6.1)
If your log contains the line Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4889 MB memory), as it does above, you are using the GPU.
2020-06-24 20:29:13.961867: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2020-06-24 20:29:13.988711: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2020-06-24 20:29:14.002012: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2020-06-24 20:29:14.006381: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2020-06-24 20:29:14.038179: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
These lines are informational only, as indicated by the I prefix. Warnings are prefixed with W and errors with E.
The WARNING:tensorflow: messages ask you to replace deprecated names with their newer equivalents (e.g. under tf.compat.v1), so that the same code keeps working in TF 2.x.
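
The prefix rule described above can be sketched as a small parser. This is an illustration only: the regular expression is an assumption based on the log lines quoted in this thread, and parse_tf_log_line is a hypothetical helper, not part of TensorFlow.

```python
import re

# Layout assumed from the quoted logs:
# "<date> <time>: <letter> <source file>:<line>] <message>"
LOG_LINE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+: "
    r"(?P<letter>[IWE]) (?P<source>\S+?):(?P<line>\d+)\] (?P<message>.*)$"
)

SEVERITY = {"I": "info", "W": "warning", "E": "error"}


def parse_tf_log_line(text):
    """Return a dict describing one log line, or None if it does not match."""
    match = LOG_LINE.match(text)
    if match is None:
        return None
    return {
        "severity": SEVERITY[match.group("letter")],
        "source": match.group("source"),
        "line": int(match.group("line")),
        "message": match.group("message"),
    }


sample = (
    "2020-06-24 20:29:13.961867: I "
    "tensorflow/stream_executor/platform/default/dso_loader.cc:42] "
    "Successfully opened dynamic library libcudart.so.10.0"
)
print(parse_tf_log_line(sample)["severity"])  # prints "info"
```

Lines such as WARNING:tensorflow:From ... come from the Python side and do not follow this layout, so the helper returns None for them.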

"CUDA_ERROR_INVALID_DEVICE: invalid device ordinal" when starting a TensorFlow session

When starting a TensorFlow session, the GPU is not detected (CUDA_ERROR_INVALID_DEVICE: invalid device ordinal):
$ CUDA_VISIBLE_DEVICES='0' python3 -c 'import tensorflow as tf; sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))'
2019-07-18 09:36:55.661519: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-07-18 09:36:55.684438: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3312000000 Hz
2019-07-18 09:36:55.684721: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x41adbb0 executing computations on platform Host. Devices:
2019-07-18 09:36:55.684750: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): <undefined>, <undefined>
2019-07-18 09:36:55.686513: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2019-07-18 09:36:55.696958: E tensorflow/stream_executor/cuda/cuda_driver.cc:318] failed call to cuInit: CUDA_ERROR_INVALID_DEVICE: invalid device ordinal
2019-07-18 09:36:55.697001: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: tobias-Z170-HD3P
2019-07-18 09:36:55.697006: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: tobias-Z170-HD3P
2019-07-18 09:36:55.697084: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:200] libcuda reported version is: 410.73.0
2019-07-18 09:36:55.697108: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:204] kernel reported version is: 410.73.0
2019-07-18 09:36:55.697113: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:310] kernel version seems to match DSO: 410.73.0
Device mapping:
/job:localhost/replica:0/task:0/device:XLA_CPU:0 -> device: XLA_CPU device
2019-07-18 09:36:55.697380: I tensorflow/core/common_runtime/direct_session.cc:296] Device mapping:
/job:localhost/replica:0/task:0/device:XLA_CPU:0 -> device: XLA_CPU device
Using CUDA_VISIBLE_DEVICES='1' instead also does not help.
CUDA (cuda_10.0.130_410.48_linux.run) is installed.
$ cat /usr/local/cuda/version.txt
CUDA Version 10.0.130
cuDNN (cudnn-10.0-linux-x64-v7.4.2.24.tgz) too:
$ cat /usr/include/x86_64-linux-gnu/cudnn_v*.h | grep CUDNN_MAJOR -A 2 | head -n 3
#define CUDNN_MAJOR 6
#define CUDNN_MINOR 0
#define CUDNN_PATCHLEVEL 21
TensorFlow (pip3 install tensorflow-gpu):
$ python3 -c 'import tensorflow as tf; print(tf.__version__)'
1.14.0
Nvidia driver (NVIDIA-Linux-x86_64-410.73.run) also:
$ nvidia-smi
Thu Jul 18 09:35:03 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.73 Driver Version: 410.73 CUDA Version: 10.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 1070 Off | 00000000:01:00.0 On | N/A |
| 0% 45C P8 17W / 230W | 569MiB / 8111MiB | 19% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 2270 G /usr/lib/xorg/Xorg 301MiB |
| 0 3021 G /opt/zoom/zoom 14MiB |
| 0 3503 G ...-token=CB875E52FAB2279C6A34C6519188AD9C 71MiB |
| 0 3534 G ...uest-channel-token=16121978823314344450 56MiB |
| 0 3618 G ...uest-channel-token=12369473663213430887 52MiB |
| 0 4249 G ...uest-channel-token=13759302641460814281 62MiB |
| 0 4499 G ...uest-channel-token=10576172133955227583 7MiB |
+-----------------------------------------------------------------------------+
I'm using Linux Mint 18.2.
Any ideas?

Solved it. I uninstalled every Nvidia driver version shown in the Synaptic package manager, installed the driver from NVIDIA-Linux-x86_64-410.73.run, and now it works.
For the record, uninstalling from the command line might look as follows:
sudo nvidia-uninstall
sudo apt-get remove --purge 'nvidia-*'
$ python3 -c 'import tensorflow as tf; sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))'
2019-07-18 10:57:07.020764: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-07-18 10:57:07.059271: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3312000000 Hz
2019-07-18 10:57:07.060038: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x53aec90 executing computations on platform Host. Devices:
2019-07-18 10:57:07.060060: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): <undefined>, <undefined>
2019-07-18 10:57:07.069543: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2019-07-18 10:57:07.216124: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-18 10:57:07.216596: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x54792a0 executing computations on platform CUDA. Devices:
2019-07-18 10:57:07.216612: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): GeForce GTX 1070, Compute Capability 6.1
2019-07-18 10:57:07.216803: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-18 10:57:07.217224: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: GeForce GTX 1070 major: 6 minor: 1 memoryClockRate(GHz): 1.7715
pciBusID: 0000:01:00.0
2019-07-18 10:57:07.218763: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-07-18 10:57:07.243155: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2019-07-18 10:57:07.257961: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2019-07-18 10:57:07.263297: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2019-07-18 10:57:07.298517: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2019-07-18 10:57:07.321558: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2019-07-18 10:57:07.394510: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2019-07-18 10:57:07.394806: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-18 10:57:07.396131: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-18 10:57:07.397206: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2019-07-18 10:57:07.397798: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-07-18 10:57:07.400997: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-07-18 10:57:07.401041: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187] 0
2019-07-18 10:57:07.401059: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0: N
2019-07-18 10:57:07.401572: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-18 10:57:07.402874: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-18 10:57:07.404129: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7060 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0, compute capability: 6.1)
Device mapping:
/job:localhost/replica:0/task:0/device:XLA_CPU:0 -> device: XLA_CPU device
/job:localhost/replica:0/task:0/device:XLA_GPU:0 -> device: XLA_GPU device
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0, compute capability: 6.1
2019-07-18 10:57:07.405492: I tensorflow/core/common_runtime/direct_session.cc:296] Device mapping:
/job:localhost/replica:0/task:0/device:XLA_CPU:0 -> device: XLA_CPU device
/job:localhost/replica:0/task:0/device:XLA_GPU:0 -> device: XLA_GPU device
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0, compute capability: 6.1

No module named tensorflow after installation?

I installed tensorflow-gpu, but I get an error in PyCharm:
ModuleNotFoundError: No module named 'tensorflow'
I checked in terminal:
$ pip3 list|grep tensorflow
tensorflow-gpu 1.4.0
tensorflow-tensorboard 0.4.0
Edit (after installation using venv):
Successfully installed tensorflow-gpu-1.12.0
(venv) wojtek@wojtek-GF63-8RC:~$ python -c "import tensorflow as tf; tf.enable_eager_execution(); print(tf.reduce_sum(tf.random_normal([1000, 1000])))"
2018-12-17 21:49:14.893016: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-12-17 21:49:14.961123: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:964] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-12-17 21:49:14.961466: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: GeForce GTX 1050 major: 6 minor: 1 memoryClockRate(GHz): 1.493
pciBusID: 0000:01:00.0
totalMemory: 3.95GiB freeMemory: 3.58GiB
2018-12-17 21:49:14.961479: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2018-12-17 21:49:15.148507: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-12-17 21:49:15.148538: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2018-12-17 21:49:15.148544: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2018-12-17 21:49:15.148687: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3306 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050, pci bus id: 0000:01:00.0, compute capability: 6.1)
tf.Tensor(918.94904, shape=(), dtype=float32)

You'll want to configure the interpreter:
1) In the Project Interpreters page, select one of the configured interpreters or virtual environments.
2) Click Edit.
3) In the Edit Python Interpreter dialog box that opens, type the desired interpreter name.
The Python interpreter name specified in the Name field becomes visible in the list of available interpreters.
If necessary, change the path to the Python executable.
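
To see whether the interpreter PyCharm is actually running can find the package, a small check like the following may help. It uses only the standard library; describe_module is a hypothetical helper name.

```python
import importlib.util
import sys


def describe_module(name):
    """Return the file `name` would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# The interpreter running this script; compare it with the one pip3 used.
print("interpreter:", sys.executable)
print("tensorflow :", describe_module("tensorflow"))
```

Run it once from inside PyCharm and once from the terminal. If the two interpreter paths differ, PyCharm is probably using a different environment from the one tensorflow-gpu was installed into.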

Your kernel may have been built without NUMA support

I have a Jetson TX2 with Python 2.7, TensorFlow 1.5, and CUDA 9.0.
TensorFlow seems to be working, but every time I run the program I get this warning:
with tf.Session() as sess:
    print(sess.run(y, feed_dict))
...
2018-08-07 18:07:53.200320: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:881] could not open file to read NUMA node: /sys/bus/pci/devices/0000:00:00.0/numa_node Your kernel may have been built without NUMA support.
2018-08-07 18:07:53.200427: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 0 with properties:
name: NVIDIA Tegra X2
major: 6
minor: 2
memoryClockRate(GHz): 1.3005
pciBusID: 0000:00:00.0
totalMemory: 7.66GiB
freeMemory: 1.79GiB
2018-08-07 18:07:53.200474: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
2018-08-07 18:07:53.878574: I tensorflow/core/common_runtime/gpu/gpu_device.cc:859] Could not identify NUMA node of /job:localhost/replica:0/task:0/device:GPU:0, defaulting to 0. Your kernel may not have been built with NUMA support.
Should I be worried? Or is it something negligible?

It shouldn't be a problem for you, since you don't need NUMA support for this board (it has only one memory controller, so memory accesses are uniform).
Also, I found a post on the NVIDIA forum that seems to confirm this.
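
The file that TensorFlow failed to open can also be inspected directly. The sketch below reads it with plain Python; read_numa_node is a hypothetical helper, and the sysfs_root parameter exists only so the function can be pointed at another directory.

```python
import os


def read_numa_node(pci_bus_id, sysfs_root="/sys/bus/pci/devices"):
    """Return the NUMA node that sysfs reports for a PCI device.

    Returns None when the numa_node file is missing or unreadable, which
    is the situation the error message in the question describes.
    """
    path = os.path.join(sysfs_root, pci_bus_id, "numa_node")
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


# Bus id taken from the log in the question.
print(read_numa_node("0000:00:00.0"))
```

On a board where the file does not exist this prints None, which matches the message in the log and is consistent with the answer above.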

TF 1.7: Not found: TF GPU device with id 0 was not registered

I built the TF 1.7 libraries from source.
When I use them with the Rust bindings I get:
2018-04-16 23:40:09.254248: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-04-16 23:40:09.254550: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1344] Found device 0 with properties:
name: GeForce GTX 1060 6GB major: 6 minor: 1 memoryClockRate(GHz): 1.7465
pciBusID: 0000:01:00.0
totalMemory: 5.93GiB freeMemory: 4.95GiB
2018-04-16 23:40:09.254562: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1423] Adding visible gpu devices: 0
2018-04-16 23:40:09.383859: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-04-16 23:40:09.383889: I tensorflow/core/common_runtime/gpu/gpu_device.cc:917] 0
2018-04-16 23:40:09.383894: I tensorflow/core/common_runtime/gpu/gpu_device.cc:930] 0: N
2018-04-16 23:40:09.384066: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4711 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060 6GB, pci bus id: 0000:01:00.0, compute capability: 6.1)
2018-04-16 23:40:09.463369: E tensorflow/core/grappler/clusters/utils.cc:127] Not found: TF GPU device with id 0 was not registered
Everything looks OK except the last line.
What is this error? It doesn't appear in Python.
Someone built TF 1.7 on a Mac with Python and got the same error: https://gist.github.com/pavelmalik/d51036d508c8753c86aed1f3ff1e6967
