tensorflow-gpu recognizes XLA-CPU instead of GPU - tensorflow

I am trying to install keras-gpu on a PC with a Tesla V100 and Windows Server 2019. I installed version 2.4.3 and found that my GPU is not being used. I need any 2.x.x version of Keras with GPU support.
I installed CUDA 10.1 with cuDNN 8.0.5 and, after many attempts, also tried CUDA 11.2 with cuDNN 8.1.1 (and CUDA 11.5 as well). Then I started looking for a TensorFlow version that can find my GPU.
For 10.1, nvcc reports:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:12:52_Pacific_Daylight_Time_2019
Cuda compilation tools, release 10.1, V10.1.243
I am using this code to check the setup:
import tensorflow
print(tensorflow.__version__)
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
My output:
2021-11-06 10:39:16.326880: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2.3.0
2021-11-06 10:39:21.177512: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-11-06 10:39:21.208333: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x25d395509b0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-11-06 10:39:21.217997: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2021-11-06 10:39:21.261861: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library nvcuda.dll
2021-11-06 10:39:21.677227: E tensorflow/stream_executor/cuda/cuda_driver.cc:314] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2021-11-06 10:39:21.692028: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: windows-freqgpu
2021-11-06 10:39:21.700398: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: windows-freqgpu
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 881354854201867138
, name: "/device:XLA_CPU:0"
device_type: "XLA_CPU"
memory_limit: 17179869184
locality {
}
incarnation: 5868137251793075209
physical_device_desc: "device: XLA_CPU device"
]
The Tesla V100 shows up here as XLA_CPU. How can I fix this?

You could try installing tensorflow-gpu 2.2.x or 2.3.x, which are compatible with CUDA 10.1, as can be checked in the tested build configurations:
https://www.tensorflow.org/install/source#gpu
If you look at the tested build configurations, you will see that TensorFlow 2.4.0 is tested with CUDA 11.0. The software requirements on the TensorFlow GPU support page (https://www.tensorflow.org/install/gpu#software_requirements) indicate that CUDA 11.2 is recommended only for TensorFlow >= 2.5.0.
It is unlikely that your GPU is being recognized as an 'XLA_CPU' device. 'XLA' stands for 'accelerated linear algebra' (https://www.tensorflow.org/xla); it is a domain-specific compiler that can target both CPUs and GPUs. For more details, see the question "what is XLA_GPU and XLA_CPU for tensorflow". It is more likely that your GPU is simply not detected, as evidenced by this line in your output:
2021-11-06 10:39:21.677227: E tensorflow/stream_executor/cuda/cuda_driver.cc:314] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
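The version pairings named in this answer can be written down as a small lookup, which makes it easier to see why installing CUDA 11.2 or 11.5 would not help TensorFlow 2.3. This is only a sketch: the table holds just the pairings mentioned above (the tested build configurations page remains the authority), and `expected_cuda` is a hypothetical helper name.

```python
# Pairings taken from the answer above; this is not a complete compatibility table.
TESTED_CUDA = {
    (2, 2): "10.1",
    (2, 3): "10.1",
    (2, 4): "11.0",
    (2, 5): "11.2",
}


def expected_cuda(tf_version):
    """Return the CUDA version listed for a TensorFlow release, or None if unknown."""
    major, minor = (int(part) for part in tf_version.split(".")[:2])
    return TESTED_CUDA.get((major, minor))


if __name__ == "__main__":
    for version in ("2.3.0", "2.4.0", "2.5.0"):
        print(version, "->", expected_cuda(version))
```

A result of `None` only means the release is not in this small table, not that it is unsupported.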

As @talonmies mentioned, it was a driver-related problem; to be more precise, a Tesla driver problem. I had updated the driver, but Tesla cards require specific driver versions for different CUDA versions.
For common GPUs, the CUDA installer brings the correct driver itself.
Correct installation for Tesla V100 / Windows Server 2019 / CUDA 10.1:
1. Install CUDA (10.1 in my case).
2. Install the driver that fits this CUDA version (427.60).
3. Install cuDNN (7.6.5).
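One way to confirm which driver is actually installed is to ask `nvidia-smi`, which ships with the driver. The sketch below assumes `nvidia-smi` is on `PATH`; the helper names are made up for illustration, and 427.60 is simply the version reported to work in this answer, not an official minimum.

```python
import shutil
import subprocess


def parse_driver_version(text):
    """Turn a version string such as '427.60' into a comparable tuple (427, 60)."""
    return tuple(int(part) for part in text.strip().split("."))


def installed_driver_version():
    """Return the driver version reported by nvidia-smi, or None if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    lines = result.stdout.strip().splitlines()
    if result.returncode != 0 or not lines:
        return None
    return parse_driver_version(lines[0])


if __name__ == "__main__":
    wanted = parse_driver_version("427.60")  # driver used with CUDA 10.1 in this answer
    found = installed_driver_version()
    if found is None:
        print("nvidia-smi not available; is the driver installed?")
    else:
        print("driver", found, "matches" if found == wanted else "differs from", wanted)
```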

Related

"successfully opened CUDA library" not showing while importing tensorflow

I have installed Cuda 11.2 and cuDNN 8.1 and TensorFlow 2.9.1 in my local system.
C:\Users\my>nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Nov_30_19:15:10_Pacific_Standard_Time_2020
Cuda compilation tools, release 11.2, V11.2.67
Build cuda_11.2.r11.2/compiler.29373293_0
import tensorflow as tf
tf.print(tf.__version__)
print(tf.test.is_built_with_cuda())
print(tf.config.list_physical_devices('GPU'))
2.9.1
True
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
The environment variables are also set correctly (screenshot attached).
However, when I import TensorFlow, I don't see the "successfully opened CUDA library" messages.
The following log is printed in the Jupyter notebook:
2022-08-13 12:37:34.012384: I TensorFlow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-13 12:37:35.492829: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 2153 MB memory: -> device: 0, name: NVIDIA GeForce GTX 1650, pci bus id: 0000:01:00.0, compute capability: 7.5
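Note that the log above does contain a `Created device ... GPU:0` entry. One way to check captured log text for that entry, rather than for the "successfully opened" messages, is sketched below. The regular expression is an assumption based only on the log lines quoted on this page, and `find_gpu_devices` is a hypothetical helper.

```python
import re

# Matches the tail of TensorFlow's "Created ... device" log entries as quoted on this page.
GPU_ENTRY = re.compile(
    r"name: (?P<name>[^,]+), pci bus id: (?P<bus>[0-9a-fA-F:.]+), "
    r"compute capability: (?P<cc>\d+\.\d+)"
)


def find_gpu_devices(log_text):
    """Return (name, pci bus id, compute capability) for each GPU entry in the log text."""
    # Log entries may be wrapped across lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    return [(m["name"], m["bus"], m["cc"]) for m in GPU_ENTRY.finditer(flat)]


if __name__ == "__main__":
    sample = (
        "2022-08-13 12:37:35.492829: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created\n"
        "device /job:localhost/replica:0/task:0/device:GPU:0 with 2153 MB memory: -> device: 0, "
        "name: NVIDIA GeForce GTX 1650, pci bus id: 0000:01:00.0, compute capability: 7.5"
    )
    print(find_gpu_devices(sample))
```

An empty list means no such entry was found in the text that was passed in.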

Not showing GPU, installed tensorflow using Anaconda

I installed TensorFlow using the commands below:
conda create -n gpu_env tensorflow-gpu
conda activate gpu_env
and tried to check for the GPU with the code below:
from tensorflow.python.client import device_lib
device_lib.list_local_devices()
The output shows only the CPU:
2021-04-18 20:54:47.012684: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 9318913720310627740
]
What should I do?

Note that tensorflow-gpu currently installs TensorFlow v2.3.0 without the conda cudnn or cudatoolkit packages. One thing you can do is install an earlier version of TensorFlow, which does install cudnn and cudatoolkit, and then upgrade with pip:
conda install tensorflow-gpu=2.1
pip install tensorflow-gpu==2.3.1
The TensorFlow build that Anaconda automatically selects on Windows 10 when installing tensorflow-gpu seems to be faulty, so check this workaround.
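To see whether an environment actually contains the CUDA pieces, its package list can be inspected. The sketch below assumes the JSON printed by `conda list --json` (a list of objects with `name` and `version` keys); `missing_gpu_packages` is a hypothetical helper, the two package names are the conda packages mentioned above, and the sample string is illustrative rather than real output.

```python
import json

# The conda packages that the answer above says are not pulled in automatically.
GPU_PACKAGES = ("cudatoolkit", "cudnn")


def missing_gpu_packages(conda_list_json):
    """Return the GPU-related conda packages absent from `conda list --json` output."""
    installed = {entry["name"] for entry in json.loads(conda_list_json)}
    return [name for name in GPU_PACKAGES if name not in installed]


if __name__ == "__main__":
    # Illustrative sample only; replace it with the real output of `conda list --json`.
    sample = '[{"name": "tensorflow-gpu", "version": "2.3.0"}]'
    print("missing:", missing_gpu_packages(sample) or "nothing")
```

To check a real environment, pass the text printed by `conda list --json` (run inside the activated environment) to `missing_gpu_packages`.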

Non-OK-status: GpuLaunchKernel(...) status: Internal: no kernel image is available for execution on the device

I run my code on TensorFlow 2.1.0 (Anaconda) with CUDA Toolkit 10.1 and cuDNN 7.6.0 on Windows 10, and it fails with this error:
F .\tensorflow/core/kernels/random_op_gpu.h:232] Non-OK-status: GpuLaunchKernel(FillPhiloxRandomKernelLaunch<Distribution>, num_blocks, block_size, 0, d.stream(), gen, data, size, dist) status: Internal: no kernel image is available for execution on the device
My GPU: GT940MX, compute capability 5.0
I already ran nvcc -V and it returns:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Fri_Feb__8_19:08:26_Pacific_Standard_Time_2019
Cuda compilation tools, release 10.1, V10.1.105
This is the full output:
2020-08-05 10:05:48.368012: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-08-05 10:06:00.488544: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library nvcuda.dll
2020-08-05 10:06:48.153611: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce 940MX computeCapability: 5.0
coreClock: 0.8605GHz coreCount: 4 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 37.33GiB/s
2020-08-05 10:06:48.164731: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-08-05 10:06:48.245826: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2020-08-05 10:06:48.296245: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2020-08-05 10:06:48.338860: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2020-08-05 10:06:48.439393: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2020-08-05 10:06:48.489830: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2020-08-05 10:06:48.941872: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-08-05 10:06:48.946651: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-08-05 10:06:48.951881: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-08-05 10:06:48.979077: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x23d29b660d0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-08-05 10:06:48.985680: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-08-05 10:06:48.990616: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce 940MX computeCapability: 5.0
coreClock: 0.8605GHz coreCount: 4 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 37.33GiB/s
2020-08-05 10:06:49.003356: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-08-05 10:06:49.009869: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2020-08-05 10:06:49.014858: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2020-08-05 10:06:49.020699: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2020-08-05 10:06:49.028876: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2020-08-05 10:06:49.033607: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2020-08-05 10:06:49.039192: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-08-05 10:06:49.045288: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-08-05 10:06:49.218497: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-08-05 10:06:49.223536: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263] 0
2020-08-05 10:06:49.226857: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0: N
2020-08-05 10:06:49.230413: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1460 MB memory) -> physical GPU (device: 0, name: GeForce 940MX, pci bus id: 0000:01:00.0, compute capability: 5.0)
2020-08-05 10:06:49.244107: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x23d301b8fa0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-08-05 10:06:49.250377: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce 940MX, Compute Capability 5.0
2020-08-05 10:06:49.446601: F .\tensorflow/core/kernels/random_op_gpu.h:232] Non-OK-status: GpuLaunchKernel(FillPhiloxRandomKernelLaunch<Distribution>, num_blocks, block_size, 0, d.stream(), gen, data, size, dist) status: Internal: no kernel image is available for execution on the device
What is the issue and how can I fix it?

It looks like this is an issue with Python 3.8 and TensorFlow 2.3. I tried TensorFlow 2.3.0 with Python 3.7, but that failed with an error involving python38.dll (I don't remember the exact error, and I have already deleted the environment). In the end I used Python 3.7 in an Anaconda environment and installed TensorFlow 2.1.0 with pip, and it works.
I also posted this question on GitHub, where it was answered: https://github.com/tensorflow/tensorflow/issues/42052

As per a screenshot taken from the TensorFlow documentation, TensorFlow versions 2.1, 2.2 and 2.3 work with cuDNN 7.4, but the cuDNN version installed on your system is 7.6.
That is most probably the reason for the error.
The solution is to downgrade your cuDNN version.
The existing version of cuDNN can be uninstalled through the Windows Control Panel, under Programs and Features.
The new version of cuDNN can be installed as shown in the NVIDIA Installation Guide.
Also, please refer to this GitHub issue for more on how to downgrade cuDNN.

I had the same problem; my cuDNN was 8.0.2.
As you say, there is no cuDNN 7.4 for CUDA 10.1.
So I tried cuDNN 7.5 for CUDA 10.1, and it works!
I hope my experience helps someone else. :)

It seems that each cuDNN version is supported only by specific versions of TensorFlow.
As a Windows user, this is what I did:
Check which TensorFlow and CUDA version combinations are compatible (you can select another OS on the left).
As Rock Jefferson commented, you can use cuDNN 7.5 for CUDA 10.1. It worked for me.
Download here
Try it. I hope it is useful for you.

The kernel appears to have died. It will restart automatically (TensorFlow 2.0)

CUDA version: 10.0
TensorFlow 2.0.0-alpha0
Jupyter notebook
Python 3.5.6
CUDA toolkit 10.0
Ubuntu 18.10
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 5
#define CUDNN_PATCHLEVEL 1
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
#include "driver_types.h"
The kernel appears to have died. It will restart automatically.
The strange thing is that the cuDNN version I installed is 7.5.1.
But the error message keeps saying: Loaded runtime CuDNN library: 7.3.1 but source was compiled with: 7.4.2. CuDNN library major and minor version needs to match or have higher minor version in case of CuDNN 7.0 or later version. I have no idea where the 7.3.1 cuDNN library came from.
This message appeared after I installed TensorFlow '2.0.0-alpha0'.
name: TITAN Xp major: 6 minor: 1 memoryClockRate(GHz): 1.582 pciBusID: 0000:42:00.0 totalMemory: 11.90GiB freeMemory: 11.10GiB
2019-05-14 08:03:49.028136: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1546] Adding visible gpu devices: 0
2019-05-14 08:03:49.028424: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-05-14 08:03:49.029904: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1015] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-05-14 08:03:49.029928: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1021] 0
2019-05-14 08:03:49.029938: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1034] 0: N
2019-05-14 08:03:49.030490: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1149] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10796 MB memory) -> physical GPU (device: 0, name: TITAN Xp, pci bus id: 0000:42:00.0, compute capability: 6.1)
2019-05-14 08:03:57.812557: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2019-05-14 08:03:58.069834: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2019-05-14 08:03:58.730756: E tensorflow/stream_executor/cuda/cuda_dnn.cc:328] Loaded runtime CuDNN library: 7.3.1 but source was compiled with: 7.4.2. CuDNN library major and minor version needs to match or have higher minor version in case of CuDNN 7.0 or later version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
2019-05-14 08:03:58.732680: E tensorflow/stream_executor/cuda/cuda_dnn.cc:328] Loaded runtime CuDNN library: 7.3.1 but source was compiled with: 7.4.2. CuDNN library major and minor version needs to match or have higher minor version in case of CuDNN 7.0 or later version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
2019-05-14 08:03:58.733465: F tensorflow/core/kernels/conv_grad_input_ops.cc:955] Check failed: stream->parent()->GetConvolveBackwardDataAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo(stream->parent()), &algorithms)
This is the main message when the kernel died.
Question: the Jupyter notebook kernel dies and restarts with the above message.
How can I fix it?
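The rule stated in the error message can be written out directly, which shows why a loaded 7.3.1 fails against a build compiled for 7.4.2, while the 7.5.1 that was installed would pass if it were the library actually loaded. This is a sketch of the rule as worded in the log, not TensorFlow's actual check; both function names are hypothetical.

```python
def cudnn_version_number(major, minor, patch):
    """Mirror the CUDNN_VERSION macro from cudnn.h quoted above."""
    return major * 1000 + minor * 100 + patch


def runtime_is_compatible(loaded, compiled):
    """Apply the rule from the log: same major version, loaded minor >= compiled minor."""
    loaded_major, loaded_minor = (int(p) for p in loaded.split(".")[:2])
    compiled_major, compiled_minor = (int(p) for p in compiled.split(".")[:2])
    return loaded_major == compiled_major and loaded_minor >= compiled_minor


if __name__ == "__main__":
    print(cudnn_version_number(7, 5, 1))            # 7501
    print(runtime_is_compatible("7.3.1", "7.4.2"))  # False
    print(runtime_is_compatible("7.5.1", "7.4.2"))  # True
```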

With multiple versions of CUDA installed, how can I make Tensorflow-GPU use a specific version of CUDA on Windows

I currently have two versions of CUDA installed on my computer: 9.0 and 10.0. I have some Python modules that require CUDA 9.0 and some that require 10.0. For example, the version of Tensorflow-GPU I use requires CUDA 10.0. When I try to start training, I get the following error message:
2019-05-23 10:59:35.911847: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2019-05-23 10:59:39.907756: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties:
name: Tesla V100-PCIE-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.38
pciBusID: 0000:84:00.0
totalMemory: 15.90GiB freeMemory: 14.98GiB
2019-05-23 10:59:39.919434: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
Traceback (most recent call last):
File "wider_faces_inference.py", line 137, in <module>
output_dict_array = run_inference_for_images(image_np_list, detection_graph)
File "wider_faces_inference.py", line 74, in run_inference_for_images
with tf.Session() as sess:
File "C:\ProgramData\Anaconda3\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\client\session.py", line 1551, in __init__
super(Session, self).__init__(target, graph, config=config)
File "C:\ProgramData\Anaconda3\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\client\session.py", line 676, in __init__
self._session = tf_session.TF_NewSessionRef(self._graph._c_graph, opts)
tensorflow.python.framework.errors_impl.InternalError: cudaGetDevice() failed. Status: CUDA driver version is insufficient for CUDA runtime version
I believe this is because TensorFlow is not looking for the right version of CUDA. How can I make TensorFlow use the correct version of CUDA?
EDIT:
To add a bit more information:
The version of Tensorflow I installed was compiled against CUDA 10.0. I installed CUDA 10.0 and Tensorflow-GPU first, and tensorflow worked just fine. Then I installed CUDA 9.0, and after installation, tensorflow stopped working.

Each version of CUDA comes with a driver that you can choose to install; newer versions of NVIDIA drivers support older versions of CUDA, but the reverse isn't true. The driver that comes with CUDA 9.0 is not able to run CUDA 10.0 applications.
All that you need to do is install the latest NVIDIA driver (or generally, any NVIDIA driver that has been released since CUDA 10.0) in order to have support for CUDA 9.X and 10.0 applications. The path of least resistance might be to reinstall the driver that came with CUDA 10.0, but you should get the most recent driver regardless.
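The compatibility rule described here amounts to a one-directional comparison. In the sketch below each driver is described by the newest CUDA version it supports, which is an illustrative simplification; `driver_can_run` is a hypothetical helper.

```python
def parse_cuda_version(text):
    """Turn '10.0' into (10, 0) so that versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def driver_can_run(driver_max_cuda, runtime_cuda):
    """A driver runs applications built for its own CUDA version or any older one."""
    return parse_cuda_version(runtime_cuda) <= parse_cuda_version(driver_max_cuda)


if __name__ == "__main__":
    # Driver bundled with CUDA 9.0 versus a TensorFlow build that needs CUDA 10.0.
    print(driver_can_run("9.0", "10.0"))   # False
    print(driver_can_run("10.0", "9.0"))   # True
    print(driver_can_run("10.0", "10.0"))  # True
```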