Can CUDA 10.0 and 10.1 be on the same system? - tensorflow

I want support for both Visual Studio 2019 (which needs CUDA 10.1) and TensorFlow-GPU 1.14 (which needs CUDA 10.0) on a Windows PC. Is there a way to do this?
I simply installed CUDA 10.0 and CUDA 10.1 and added both directories to the CUDA_PATH environment variable. cuDNN is already installed.
The result is that Visual Studio can detect CUDA but TensorFlow cannot.

Yes, more than one version of the CUDA toolkit can exist on a system and be used by different applications.
How are you installing TensorFlow-GPU? If you're compiling it yourself, you can specify the path to whichever version of CUDA you want to use during configuration. If you're installing pre-built binaries (e.g. using something like Anaconda), they have already been built against a specific version of the CUDA toolkit; you'll need to fetch binaries compiled for the CUDA toolkit you want, or build them yourself.
If you use Anaconda to install TensorFlow-GPU, you should also receive the version of the CUDA toolkit needed to run the TensorFlow-GPU version you've installed; it takes care of those dependencies for you.
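To see which toolkits are installed side by side, a minimal Python sketch could list the versioned folders under the CUDA install root. It assumes the usual Windows layout of one vX.Y folder per toolkit (for example under C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA); the function name list_cuda_toolkits is hypothetical, and the root may differ on your machine.

```python
import re
from pathlib import Path


def list_cuda_toolkits(root):
    """Return version strings such as '10.0' for each vX.Y folder under root.

    Assumption: every toolkit lives in a folder named v<major>.<minor>.
    """
    pattern = re.compile(r"^v(\d+)\.(\d+)$")
    root = Path(root)
    if not root.is_dir():
        return []
    found = []
    for child in root.iterdir():
        match = pattern.match(child.name)
        if match and child.is_dir():
            # Keep versions as integer pairs so 10.1 sorts after 9.2.
            found.append((int(match.group(1)), int(match.group(2))))
    return [f"{major}.{minor}" for major, minor in sorted(found)]


if __name__ == "__main__":
    # Assumed default install root on Windows; adjust if yours differs.
    print(list_cuda_toolkits(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA"))
```

If both 10.0 and 10.1 show up here, the toolkits coexist on disk, and the remaining question is which one each application actually loads.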

Related

Loaded runtime CuDNN library: 8.0.5 but source was compiled with: 8.1.0

I get this error when I run model.fit_generator to train a CNN model on images. I don't understand the error. What should I do? Can anyone help me?
This is the full error description:
`Loaded runtime CuDNN library: 8.0.5, but the source was compiled with: 8.1.0. CuDNN library needs to have a matching major version and equal or higher minor version. If using a binary install, upgrade your CuDNN library. If building from sources, ensure the library loaded at runtime is compatible with the version specified during compile configuration.`
I had the same error "tensorflow/stream_executor/cuda/cuda_dnn.cc:362] Loaded runtime CuDNN library: 8.0.5 but source was compiled with: 8.1.0."
I solved it by downgrading TensorFlow. The error means that the installed TensorFlow version is newer than the cuDNN version on Google Colab supports. I used TensorFlow 2.4.0 together with the dependencies that version requires.
This page lists which cuDNN version goes with each TensorFlow version: https://www.tensorflow.org/install/source
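The rule quoted in the error message (matching major version, equal or higher minor version) can be sketched in a few lines of Python. This is only an illustration of the rule as the message states it, not TensorFlow's actual check; the function names are hypothetical.

```python
def parse_version(text):
    """Turn a dotted version string like '8.0.5' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def cudnn_runtime_ok(runtime, compiled):
    """Apply the rule from the error message.

    The runtime cuDNN must have the same major version as the one the
    binary was compiled with, and an equal or higher minor version.
    """
    run = parse_version(runtime)
    comp = parse_version(compiled)
    return run[0] == comp[0] and run[1] >= comp[1]


if __name__ == "__main__":
    # The combination from the error above: runtime 8.0.5, compiled with 8.1.0.
    print(cudnn_runtime_ok("8.0.5", "8.1.0"))
```

Under this rule there are two ways out: raise the runtime cuDNN to at least 8.1, or install a TensorFlow build compiled against cuDNN 8.0, which is what the downgrade achieves.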
You should always install the library versions that match the ones your dependency was compiled against.
You can download the version you need from the NVIDIA website or use conda for package management. It will handle all dependencies for you.
You can install Miniconda and run conda install -c anaconda tensorflow-gpu to get it sorted out for you. If you need a specific version of Python, you can create an environment with it.
My solution:
After confirming that my CUDA and cuDNN versions were compatible with TensorFlow, I first thought the system had not picked up the changes after installation. Several restarts showed that this was not the problem, so I started checking all the software on the system that depends on CUDA and cuDNN. I uninstalled MATLAB along the way, but that did not help. Later it occurred to me that PyTorch is also tied to CUDA and cuDNN. I checked my PyTorch version and found that I was using torch 1.8, built for CUDA 11.1 with cuDNN 8.0.5, which explained the error. Upgrading PyTorch finally solved it.
I have faced the same issue. It seems that each TensorFlow version requires a specific cuDNN version.
Check the link below for the required versions.
https://www.tensorflow.org/install/source#gpu
Thanks for this answer:
My solution:
After confirming that my cuda and cudnn versions are compatible with
tensorflow, I first thought that the system ...
It helped me a lot, but I solved the problem a different way.
I found that PyTorch 1.8 is compatible with cuDNN 8.1.0. So, instead of upgrading PyTorch, I overwrote the cuDNN 8.0.5 DLLs with the cuDNN 8.1.0 ones in the directory D:\Program Files\Python37\Lib\site-packages\torch\lib. You can find this location with Everything, which is always helpful.

Installation issue of CUDA and cuDNN on Windows

I am checking the CUDA and cuDNN installation on a system, and have several observations:
Two versions of CUDA are installed, 9.0 and 11.2.
cuDNN was found only in the installation directory of CUDA 9.0.
The CUDA 9.0 directory contains cudafe.exe, while the CUDA 11.2 directory does not.
Given this, do I have to uninstall one of the CUDA versions to avoid a conflict?
You can have both versions installed together. However, you can only use one of them at a time. For cuDNN, you will need to download the build for CUDA 11.2 from here and put the files in "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2". Finally, make sure the system path points to the desired version. For example, if you want CUDA 11.2, open "Environment Variables" and make sure that both "CUDA_PATH" and "CUDNN" are set to "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2". That directory must also be listed in the "Path" variable.
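As an alternative to editing the system-wide variables each time, the same idea can be sketched per process in Python. This is a minimal sketch assuming the Windows layout described above; env_for_cuda is a hypothetical helper, and it only sets CUDA_PATH and PATH in a copy of the environment, leaving the real system settings untouched.

```python
import ntpath

# Install root taken from the answer above; adjust if yours differs.
CUDA_ROOT = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA"


def env_for_cuda(version, base_env, root=CUDA_ROOT):
    """Return a copy of base_env that points at a single CUDA toolkit.

    version is a string such as '11.2'. Windows conventions are assumed:
    backslash paths and ';' as the PATH separator.
    """
    toolkit = ntpath.join(root, "v" + version)
    bin_dir = ntpath.join(toolkit, "bin")
    env = dict(base_env)
    env["CUDA_PATH"] = toolkit
    # Put this toolkit's bin directory first so its DLLs are found first.
    old_path = env.get("PATH", "")
    env["PATH"] = bin_dir + (";" + old_path if old_path else "")
    return env


if __name__ == "__main__":
    import os

    env = env_for_cuda("11.2", os.environ)
    print(env["CUDA_PATH"])
```

The returned dictionary can be passed as the env argument of subprocess.run, so one program runs against 11.2 while another still sees 9.0.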

Tensorflow can find right cudnn in one python file but fail in another

I am trying to use the GPU version of TensorFlow to train and test my deep learning model, but I have run into a problem. When I train my model in one Python file, everything works and tensorflow-gpu is used properly. I then save the model as a pretrained one in graph.pb format and try to reuse it in another Python file.
That is when I get the following error message.
E tensorflow/stream_executor/cuda/cuda_dnn.cc:363] Loaded runtime CuDNN
library: 7.1.4 but source was compiled with: 7.2.1. CuDNN library major
and minor version needs to match or have higher minor version in case of
CuDNN 7.0 or later version. If using a binary install, upgrade your CuDNN
library. If building from sources, make sure the library loaded at runtime
is compatible with the version specified during compile configuration.
I checked my cuDNN version, and it is in fact 7.4.2. I also checked my environment path settings: /cuda/v9.0/bin, cuda/v9.0/lib/x64 and /cuda/v9.0/include are all there.
So why does this happen, and how can I solve it?
--
cuda:v9.0
cudnn:7.4.2 (I think; I copied the cudnn files manually)
windows 10
python: 3.5
If you have multiple cuDNN versions installed through different routes, such as an Anaconda package and a Windows installation, you need to remove the older version so that your code detects the latest one, and then reinstall tensorflow-gpu.
You can follow this guide for installation based on OS.
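To find out whether more than one cuDNN copy is reachable, a rough diagnostic is to walk the PATH entries in order and report every cuDNN library found. This is a sketch under assumptions: the helper name find_cudnn_on_path is hypothetical, it matches files by name prefix only (for example cudnn64_7.dll), and it does not reproduce the full library search order of the operating system or of Python.

```python
import os
from pathlib import Path


def find_cudnn_on_path(path_value, sep=os.pathsep):
    """List cuDNN library files in PATH order.

    path_value is the raw PATH string. Entries that are empty or are not
    existing directories are skipped.
    """
    hits = []
    for entry in path_value.split(sep):
        if not entry:
            continue
        directory = Path(entry)
        if not directory.is_dir():
            continue
        for item in sorted(directory.iterdir()):
            name = item.name.lower()
            # Match by prefix: cudnn*.dll on Windows, libcudnn* on Linux.
            if item.is_file() and name.startswith(("cudnn", "libcudnn")):
                hits.append(str(item))
    return hits


if __name__ == "__main__":
    for hit in find_cudnn_on_path(os.environ.get("PATH", "")):
        print(hit)
```

If an older copy is listed before the one you expect, that would be consistent with the runtime reporting 7.1.4 even though 7.4.2 files were copied into the CUDA folder.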

Tensorflow 1.11 needs CuDNN 7.2 for CUDA 9.0, but there is no such library

The requirements of the current version of tensorflow 1.11 to run on GPU are
CUDA® Toolkit —TensorFlow supports CUDA 9.0.
cuDNN SDK (>= 7.2)
However, the cuDNN download page only lists
Download cuDNN v7.2.1 (August 7, 2018), for CUDA 9.2
Given that CuDNN comes with different binaries for minor revisions of the CUDA toolkit (e.g. CuDNN 7.1.3 has one binary for CUDA 9.1 and another for CUDA 9.0), I suppose that this binary of CuDNN 7.2 is not compatible with CUDA 9.0.
Is this a documentation bug? If not, how can I fulfill the requirements of TF 1.11?
I found the link below by modifying the addresses of the publicly listed libraries: https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v7.2.1/prod/9.0_20180806/cudnn-9.0-windows10-x64-v7.2.1.38
As @emilyfy suggested, the addresses of other versions and OSs that are hosted but not listed can be obtained the same way.
Go to this page instead. https://developer.nvidia.com/rdp/cudnn-download
It has the link for
Download cuDNN v7.3.0 (Sept 19, 2018), for CUDA 9.0
cuDNN v7.2.1 for CUDA 9.0 used to be there but now that they have v7.3.0 it's not in the archives anymore. I'm having the same problems too with a model I built on another PC. Luckily I hadn't deleted the installers. I'll share them (only the deb installers for Linux) here.

Support for Nvidia CUDA Toolkit 9.2

What is the reasoning that Tensorflow-gpu is bound to a specific version of Nvidia's CUDA Toolkit? The current version appears to look for 9.0 specifically and will not work with anything greater. For example I installed the latest Toolkit 9.2 and added it to path but Tensorflow-gpu will not work with it and complains that it is looking for 9.0.
I can see major version updates not being supported but a minor release?
That's a good question. According to NVidia's website,
The CUDA driver is backward compatible, meaning that applications compiled against a particular version of the CUDA will continue to work on subsequent (later) driver releases.
So technically, it should not be a problem to support later iterations of a CUDA driver. And in practice, you will find working non-official pre-built binaries with later versions of CUDA and CuDNN on the net [1], [2]. Even easier to install, the tensorflow-gpu package installed from conda currently comes bundled with CUDA 9.2.
When asked on the topic, a dev answered,
The answer to why is driver issues in the ones required by 9.1, not many new features we need in cuda 9.1, and a few more minor issues.
So the reason looks rather vague. He might mean that CUDA 9.1 (and 9.2) require graphics card drivers that are perhaps a bit too recent to be really convenient, but that is an uneducated guess.
If NVidia is right about binary compatibility, you may try to simply rename or link your CUDA 9.2 library as a CUDA 9.0 library and it should work. But I would save all my work before attempting this... and the fact that people go as far as recompiling tensorflow to support later CUDA versions may be a hint on how this could end.
When you download TF, you download a pre-built binary file.
During the build, TF is linked against a specific version of CUDA, so you cannot use it with a different CUDA version.
If you want to work with a newer (or sometimes older) version of CUDA, you will need to build TF from source (check how here).
Or, if you really don't want to build it yourself, check these repos, where others publish TF binaries for specific configurations. A few examples:
https://github.com/mind/wheels
https://github.com/yaroslavvb/tensorflow-community-wheels
https://github.com/fo40225/tensorflow-windows-wheel
For your convenience, here are the CUDA and cuDNN versions required by the prebuilt TensorFlow versions:
(I only list the TF versions that I have worked with; older TF versions may use older versions of CUDA as well.)
Before TF v1.5: CUDA 8.0 and cuDNN 6.
From TF v1.5: prebuilt binaries are built against CUDA 9 and cuDNN 7.
The issue is not with NVIDIA drivers but Tensorflow itself. I spent an hour trying to make it work, and finally realized that if you download the pre-built binary from googleapi.com, it is hard coded to load libcudart.so.9.0! If you have both cuda 9.0 and 9.2 installed, tensorflow will work (but it's actually loading the dynamic libraries from 9.0). (BTW, I installed TF using anaconda.)
A cleaner approach is to build TF from source. It's not too complicated.
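To check which CUDA runtime libraries the dynamic loader can actually resolve, a small Python sketch can try to load them by name with ctypes. The library names in the usage part are the Linux sonames mentioned above; can_load is a hypothetical helper, and a successful load only shows that the file is found, not that TensorFlow will work with it.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open the named library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Raised when the library is missing or cannot be loaded.
        return False
    return True


if __name__ == "__main__":
    # Sonames taken from the discussion above (Linux naming).
    for name in ("libcudart.so.9.0", "libcudart.so.9.2"):
        print(name, can_load(name))
```

If only the 9.2 soname loads, a prebuilt binary that asks for libcudart.so.9.0 would fail at import time, which matches the behaviour described in this answer.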