I recently installed TensorFlow 2.3.1 with CUDA 11.1.0 and cuDNN 8.0.4. Several forum posts say CUDA 11.1 is backwards compatible with previous versions, and I set the PATH variable as described in the TensorFlow installation guide, yet I still get this warning:
2020-10-05 13:55:42.704300: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2020-10-05 13:55:42.706817: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
But I do have an NVIDIA GeForce RTX 2070 Super card. How do I fix this issue?
I am using Python 3.8.
Thanks in advance!
According to the TensorFlow tested build configurations, CUDA 11.0 can be used with TF 2.4.0.
For TF 2.3.0, the compatible versions are CUDA 10.1 and cuDNN 7.6.
For more details please refer here.
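The library name in the warning already tells you which CUDA runtime this TensorFlow build expects. Below is a minimal sketch that pulls the name out of a log line and decodes the version. The helper names `missing_libraries` and `cudart_version` are made up for illustration, and the decoding assumes the usual file naming (`cudart64_<major><minor>.dll` on Windows, `libcudart.so.<major>.<minor>` on Linux).

```python
import re

# Matches the quoted library name in TensorFlow's dso_loader warning.
_MISSING = re.compile(r"Could not load dynamic library '([^']+)'")


def missing_libraries(log_text):
    """Return every library name reported as not loadable in the log text."""
    return _MISSING.findall(log_text)


def cudart_version(library_name):
    """Decode the CUDA version from a cudart file name, or return None.

    Assumes the naming conventions described above; this is not a TensorFlow API.
    """
    windows = re.fullmatch(r"cudart64_(\d+)(\d)\.dll", library_name)
    if windows:
        return f"{windows.group(1)}.{windows.group(2)}"
    linux = re.fullmatch(r"libcudart\.so\.(\d+)\.(\d+)", library_name)
    if linux:
        return f"{linux.group(1)}.{linux.group(2)}"
    return None


line = (
    "W tensorflow/stream_executor/platform/default/dso_loader.cc:59] "
    "Could not load dynamic library 'cudart64_101.dll'; "
    "dlerror: cudart64_101.dll not found"
)
for name in missing_libraries(line):
    print(name, "-> CUDA", cudart_version(name))
```

For the warning in the question this decodes to CUDA 10.1, which is the runtime TF 2.3 looks for, so a CUDA 11.1 install alone does not provide the requested file.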
I am using a package that requires tensorflow-gpu==2.0.0 and CUDA 10.0 with cuDNN 7.6.0.
I am running this code on an NVIDIA GPU cluster, and when I run nvidia-smi it still shows CUDA 11, which I guess is the version installed on the actual server.
I was told that I can basically 'override' this version by installing cudatoolkit in the version I need, so I installed cudatoolkit==10.0.
Unfortunately, I am now running into a problem when trying to run an LSTM-based model with tensorflow-gpu. I get the following:
2022-06-14 17:02:26.988359: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.2/lib64
2022-06-14 17:02:26.989175: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.2/lib64
2022-06-14 17:02:26.989208: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
In the path I still see CUDA 11. Is this causing the problem? How can I resolve it?
As you mentioned in the comments, you need to use TensorFlow 2.1, so you need to install cuDNN 7.6 and CUDA 10.1 specifically.
Please refer to the tested build configurations to see which CUDA and cuDNN versions are compatible with each Python and TensorFlow version.
Please check this link for more details on GPU setup.
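Before reinstalling anything, it can help to confirm whether the files the loader asks for exist in any directory on `LD_LIBRARY_PATH`. Here is a minimal sketch using only the standard library. `find_library` is a hypothetical helper, and it checks only the directories in the search path string, not the system linker cache.

```python
import os
from pathlib import Path


def find_library(name, search_path=None):
    """Return the full paths where `name` exists in the given search path.

    `search_path` is a string in the same format as LD_LIBRARY_PATH; when
    omitted, the current value of that environment variable is used.
    """
    if search_path is None:
        search_path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries such as a trailing separator
        candidate = Path(directory) / name
        if candidate.is_file():
            hits.append(str(candidate))
    return hits


# Library names taken from the warnings in the question.
for lib in ("libnvinfer.so.6", "libnvinfer_plugin.so.6"):
    print(lib, find_library(lib) or "not found on LD_LIBRARY_PATH")
```

If the only entry on the path is a CUDA 11.2 directory, as in the log above, an empty result for these names is expected.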
Machine Setting:
GPU: GeForce RTX 3060
Driver Version: 460.73.01
CUDA Driver Version: 11.2
Tensorflow: tensorflow-gpu 1.14.0
CUDA Runtime Version: 10.0
cudnn: 7.4.1
Note:
The CUDA runtime and cuDNN versions match the guide in the official TensorFlow documentation.
I've also tried tensorflow-gpu 2.0 and got the same problem.
Problem:
I am using TensorFlow for an object detection task. The program gets stuck at
2021-06-05 12:16:54.099778: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10
for several minutes.
It then gets stuck at the next loading step
2021-06-05 12:21:22.212818: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
for an even longer time. You may check log.txt for log details.
After waiting around 30 minutes, the program starts running and works well.
However, whenever the program invokes self.session.run(...), it loads the same two CUDA libraries (libcublas and libcudnn) again, which wastes a lot of time.
I am confused about where the problem comes from and how to resolve it. Could anyone help?
Discussion Issue on Github
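To see exactly where the time goes, the timestamps at the start of each log line can be compared. This is only a sketch under the assumption that every line begins with a 26-character timestamp like the ones above; `stalls` is a hypothetical helper name.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def stalls(log_lines):
    """Return (seconds, line) pairs: how long the log sat on `line`
    before the next entry was written."""
    parsed = [
        (datetime.strptime(line[:26], TIMESTAMP_FORMAT), line)
        for line in log_lines
    ]
    return [
        ((following - current).total_seconds(), line)
        for (current, line), (following, _) in zip(parsed, parsed[1:])
    ]


# The two log lines quoted in the question.
lines = [
    "2021-06-05 12:16:54.099778: I tensorflow/stream_executor/platform/default/"
    "dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10",
    "2021-06-05 12:21:22.212818: I tensorflow/stream_executor/platform/default/"
    "dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7",
]
for seconds, line in sorted(stalls(lines), reverse=True):
    print(f"{seconds:8.1f} s after: {line[28:]}")
```

Run over the full log.txt, sorting by the first element shows which steps account for the 30 minutes.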
===================================
Update
With @talonmies's help, the problem was resolved by rebuilding the environment with correctly matching versions of the GPU, CUDA, cuDNN and TensorFlow. It now works smoothly.
Generally, you can observe this behavior whenever there is an incompatibility between the TF, CUDA and cuDNN versions.
For the GeForce RTX 3060, support starts with CUDA 11.x. Once you upgrade to TF 2.4 or TF 2.5, your issue will be resolved.
For the benefit of the community, refer to the tested build configurations and the CUDA Support Matrix.
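A version check like this is easy to script. The sketch below hard-codes only the TensorFlow/CUDA pairs quoted in these posts, so treat the table as an illustration rather than a complete or authoritative matrix; the names `THREAD_CUDA_FOR_TF`, `expected_cuda` and `is_tested_pair` are hypothetical.

```python
# TensorFlow release series -> CUDA version, as quoted in these posts only.
# Not exhaustive; consult the tested build configurations for other releases.
THREAD_CUDA_FOR_TF = {
    "1.14": "10.0",
    "2.0": "10.0",
    "2.1": "10.1",
    "2.3": "10.1",
    "2.4": "11.0",
}


def expected_cuda(tf_version):
    """Return the CUDA version listed for a TF version such as '2.3.1', or None."""
    series = ".".join(tf_version.split(".")[:2])
    return THREAD_CUDA_FOR_TF.get(series)


def is_tested_pair(tf_version, cuda_version):
    """True if the CUDA major.minor matches the entry for this TF series."""
    cuda_series = ".".join(cuda_version.split(".")[:2])
    return expected_cuda(tf_version) == cuda_series


for tf_version, cuda_version in [("1.14.0", "10.0"), ("2.3.1", "11.1.0")]:
    print(tf_version, cuda_version, is_tested_pair(tf_version, cuda_version))
```

Note that a matching pair is necessary but not sufficient: as stated above, the GPU itself must also be supported by that CUDA version.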
I have seen the list of tested TF versions and their CUDA compatibility here, but I am not sure whether every TF version can be built from source with any CUDA version. For example, can TF 1.14 be built with CUDA 11.1?
No, TF 1.x is not compatible with CUDA 11.1. For CUDA 11.1 you need TF 2.3 or higher.
If you wish to use TF 1.x, it is best to use CUDA 10.0 to 10.2.
I get the following while importing TensorFlow:
Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2020-08-28 00:21:19.206030: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
System:
HP 245 G5 notebook
Operating system: Ubuntu 18.04
How do I solve this problem?
It seems you are trying to use the GPU version of TensorFlow and have installed conflicting software versions for it.
Note: GPU support is available on Ubuntu and Windows with CUDA-enabled cards only.
If you have a CUDA-enabled card, follow the instructions below.
As stated in the TensorFlow documentation, the software requirements are as follows:
NVIDIA GPU drivers - 418.x or higher
CUDA - 10.1 (TensorFlow >= 2.1.0)
cuDNN - 7.6
Make sure you have exactly these versions of the software. See this.
Also, check the system requirements here.
For downloading the software mentioned above see here.
For downloading TensorFlow follow the instructions provided here to correctly install the necessary packages.
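Once the matching versions are installed, a quick way to confirm the setup is to ask TensorFlow which GPUs it can see. This is a minimal sketch that assumes TensorFlow 2.1 or later, where `tf.config.list_physical_devices` is available; `visible_gpus` is a hypothetical wrapper that returns `None` when TensorFlow cannot be imported at all.

```python
def visible_gpus():
    """Return the names of GPUs TensorFlow can see, or None if TensorFlow
    is not installed in this environment."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # Assumes TF 2.1+, where list_physical_devices lives under tf.config.
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = visible_gpus()
if gpus is None:
    print("TensorFlow is not importable in this environment")
elif gpus:
    print("GPUs visible to TensorFlow:", gpus)
else:
    print("TensorFlow imported, but no GPU is visible")
```

An empty list together with the `libcudart` warning above usually points back to a version mismatch rather than to the hardware.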
I have a machine with CUDA 9.0 and cuDNN 7.1.
I've tried using TensorFlow 1.7.0 on this machine, but it does not work because this version of TensorFlow was built against cuDNN 7.0.
I get this error when launching training on my GPU:
Loaded runtime CuDNN library: 7102 (compatibility version 7100) but source was compiled with 7005 (compatibility version 7000).
Is there a TensorFlow version that is compatible with my CUDA and cuDNN versions? I also need this working TensorFlow version to be >=1.7.0.
I have googled this and searched every question, but I never found answers for these particular versions of CUDA and cuDNN.
This should be possible with tensorflow_gpu-1.9.0. The table linked below shows which CUDA and cuDNN versions are compatible with each version of TensorFlow.
https://www.tensorflow.org/install/install_sources#tested_source_configurations
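The four-digit numbers in the error message are packed version codes. Assuming the `major*1000 + minor*100 + patch` encoding used by cuDNN 7, they can be unpacked as in the sketch below; `decode_cudnn_version` is a hypothetical helper, not part of TensorFlow or cuDNN.

```python
def decode_cudnn_version(code):
    """Split a cuDNN version code such as 7102 into (major, minor, patch).

    Assumes the major*1000 + minor*100 + patch encoding used by cuDNN 7.
    """
    major, rest = divmod(code, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


loaded = decode_cudnn_version(7102)    # runtime library found on the machine
compiled = decode_cudnn_version(7005)  # version TensorFlow was built against
print("loaded:", ".".join(map(str, loaded)))
print("compiled against:", ".".join(map(str, compiled)))
```

Read this way, the message says the installed runtime is cuDNN 7.1.2 while the TensorFlow 1.7.0 binary was compiled against cuDNN 7.0.5, which matches the versions described in the question.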
OK, it seems I missed some installation steps.
Installing the latest version of TensorFlow, which at the time of writing is 1.9.0, worked on my machine.