How to set up different versions of CUDA in one OS?
Here is my problem: Lastest Tensorflow with GPU support requires CUDA 11.2, whereas Pytorch works with 11.3. So what is the solution to install both libraries in Windows and Ubuntu?
One solution is to use Docker Container Environment, which would only need the Nvidia Driver to be of version XYZ.AB; in this way, you can use both PyTorch and TensorFlow versions.
A very good starting point for your problem would be this one(ML-WORKSPACE) : https://github.com/ml-tooling/ml-workspace
Related
I get this error when I run the model.fit_generator code to train images using the CNN model. I don't understand the error, and what should I do? Can anyone help me?
this is the full error description
`Loaded runtime CuDNN library: 8.0.5, but the source was compiled with: 8.1.0. CuDNN library needs to have a matching major version and equal or higher minor version. If using a binary install, upgrade your CuDNN library. If building from sources, ensure the library loaded at runtime is compatible with the version specified during compile configuration.
I had the same error "tensorflow/stream_executor/cuda/cuda_dnn.cc:362] Loaded runtime CuDNN library: 8.0.5 but source was compiled with: 8.1.0."
I solved it by downgrading the TensorFlow version, here it says that you use a new version of TensorFlow that is not compatible with the google colab CuDNN version. I used TensorFlow 2.4.0 plus all the dependence required on version 2.4.0.
Here it says which version of TensorFlow to use for cudnn compatibility, https://www.tensorflow.org/install/source
You should always have version of libraries installed that is matching the version dependency you want to use is compiled with.
You can download the version you need from nvidia website or use conda for package management. It will handle all dependencies for you.
You can miniconda and type conda install -c anaconda tensorflow-gpu to get it sorted for you. If you need a specific version of python, you can create environment with it.
My solution:
After confirming that my cuda and cudnn versions are compatible with tensorflow, I first thought that the system did not synchronize after the installation was completed. After several restarts, it was found that it was not and could not be the problem, so I started to check all the cuda in the system. For the software that depends on cudnn, matlab was uninstalled during the period but it was useless. Later, I thought that pytorch is also related to cuda and cudnn. I checked the version of pytorch and found that I was using torch 1.8, and the cuda it was adapted to was 11.1 , The corresponding cudnn is 8.0.5, now the case is solved. Finally upgraded pytorch and solved it.
I have faced the same issue. It seems like if TensorFlow versions requires specific cuDNN version.
Check the link for required versions.
https://www.tensorflow.org/install/source#gpu
Thanks for This answer.
My solution:
After confirming that my cuda and cudnn versions are compatible with
tensorflow, I first thought that the system ...
It helps me a lot,but I use different way to solve this problem.
I found that pytorch 1.8 is compatible with cudnn 8.1.0. So, instead of upgrade pytorch version, I overwrite the cudnn 8.0.5 dll library with cudnn 8.1.0 in directory D:\Program Files\Python37\Lib\site-packages\torch\lib. You can find this location with Everything, which is always helpful.
It is now Oct 29, 2018
After much googling, I have not found a definitive answer or any examples of people using the latest cuda10 for tensorflow on ubuntu 14.04.
My dilemma is whether to upgrade my OS (currently at 14.04) in order to run cuda9 so I can use the latest tensorflow version or use CUDA10 on my existing 14.04 install.
Note cuda9 does not support 14.04, however, Nvidia has indicated that 14.04 will be supported for cuda10.
So, any examples/experiences of people using tensorflow with cuda10 on ubuntu14.04 are keenly sought after!
Also note cuda10 is not specifically supported by tensorflow...yet...they say "soon". But TF can be built from source with cuda10.
This is a link for cuda10+tensorflow on ubuntu16.04:
https://github.com/tensorflow/tensorflow/issues/22706
The short answer, I realize, is "try building it myself". Before I do that, I thought I'd ask around. Thanks.
I don't know whether CUDA 10 can work well on Ubuntu 14.04, but I was managed to build TensorFlow with CUDA 10 on Ubuntu 18.04 with using NVIDIA released docker image.
You can pull the 'TensorFlow Release 18.09' and try it on your current system.
If the previous step does not work, consider upgrading your OS to 18.04.
I wrote down my installation experience on this page, you could read it for some detail if you need.
I am currently working as a working student and now I have trouble installing Tensorflow-gpu on a machine using a Nvidia Quadro GV100 GPU.
On the Tensorflow homepage I found out that I need to install CUDA 9.0 and Cudnn 7.x in order to run Tensorflow-gpu 1.9. The problem is that I can't find a suitable CUDA version supporting the GV100. Could it be that there is no CUDA version yet? Is it possible that one can't use the GV100 for tensoflow-gpu?
Sorry for the stupid question, I am new to installing DL frameworks :-)
Thank you very much for your help!
On the Tensorflow homepage I found out that I need to install CUDA 9.0 and Cudnn 7.x in order to run Tensorflow-gpu 1.9.
That is if you want to install a pre-built Tensorflow binary distribution. In that case you need to use the version of CUDA which the Tensorflow binaries were built against, which in this case in CUDA 9.0
The problem is that I can't find a suitable CUDA version supporting the GV100
The CUDA 9.0 and later toolkits fully support Volta cards and that should include the Quadro GV100. The driver which ships with CUDA 9.0 is a 384 series which won't support your GPU. If you are referring to a driver support issue, then the solution would be to install the recommended driver for your GPU, and only install the CUDA toolkit from the CUDA 9.0 bundle, not the toolkit and driver, which is the default.
Otherwise you can use CUDA 9.1 or 9.2, which should have support for your GPU with their supplied drivers, but you will then need to build Tensorflow yourself from source.
What is the reasoning that Tensorflow-gpu is bound to a specific version of Nvidia's CUDA Toolkit? The current version appears to look for 9.0 specifically and will not work with anything greater. For example I installed the latest Toolkit 9.2 and added it to path but Tensorflow-gpu will not work with it and complains that it is looking for 9.0.
I can see major version updates not being supported but a minor release?
That's a good question. According to NVidia's website,
The CUDA driver is backward compatible, meaning that applications compiled against a particular version of the CUDA will continue to work on subsequent (later) driver releases.
So technically, it should not be a problem to support later iterations of a CUDA driver. And in practice, you will find working non-official pre-built binaries with later versions of CUDA and CuDNN on the net [1], [2]. Even easier to install, the tensorflow-gpu package installed from conda currently comes bundled with CUDA 9.2.
When asked on the topic, a dev answered,
The answer to why is driver issues in the ones required by 9.1, not many new features we need in cuda 9.1, and a few more minor issues.
So the reason looks rather vague -- he might mean that CUDA 9.1 (and 9.2) requires graphics card driver that are perhaps a bit too recent to be really convenient, but that is an uneducated guess.
If NVidia is right about binary compatibility, you may try to simply rename or link your CUDA 9.2 library as a CUDA 9.0 library and it should work. But I would save all my work before attempting this... and the fact that people go as far as recompiling tensorflow to support later CUDA versions may be a hint on how this could end.
When you download TF, you download a pre-built binary file.
In the build process TF is hard linked into a specific version of Cuda, so you cannot use it with different cuda versions.
If you want to work with the new (or sometimes older) version of cuda you will need to install TF from source (check how here)
Or, if you realy don't want to build yourself, check in these repos, there are others that publish specific TF binaries, few examples:
https://github.com/mind/wheels
https://github.com/yaroslavvb/tensorflow-community-wheels
https://github.com/fo40225/tensorflow-windows-wheel
For your convenience I add here the CUDA + cuDNN versions that are required for each prebuilt Tensorflow version:
(I write here just about the TF versions that I worked with, maybe older TF versions use older versions of CUDA as well)
before TF v1.5 cuda 8.0 and cuDNN 6
start from: 1.5 - Prebuilt binaries are now built against CUDA 9 and cuDNN 7.
The issue is not with NVIDIA drivers but Tensorflow itself. I spent an hour trying to make it work, and finally realized that if you download the pre-built binary from googleapi.com, it is hard coded to load libcudart.so.9.0! If you have both cuda 9.0 and 9.2 installed, tensorflow will work (but it's actually loading the dynamic libraries from 9.0). (BTW, I installed TF using anaconda.)
A cleaner approach is to build TF from source. It's not too complicated.
My MacBook Pro doesn't have a NVIDIA gpu. So it's not possible to run CUDA. I'm wondering which of the earlier versions of TensorFlow have gpu support for Mac OS? And how can I install on Anaconda?
As stated on the official site:
Note: As of version 1.2, TensorFlow no longer provides GPU support on
Mac OS X.
..so installing any earlier version should be fine. But since your hardware does not have NVIDIA graphics card with CUDA support, it doesn't matter anyway.
In terms of installing TensorFlow on Mac OSX using Anaconda, you can just follow steps nicely described in the official docs
TensorFlow relies on CUDA for GPU use so you need Nvidia GPU. There's experimental work on adding OpenCL support to TensorFlow, but it's not supported on MacOS.
On anecdotal note, I've heard bad things from people trying to use AMD cards for deep learning. Basically AMD doesn't care about deep learning, they change their interfaces without notice so things break or run slower than CPU.