GPU card not picked up by tensorflow-gpu - tensorflow

I installed the tensorflow-gpu version, and tried to test the GPU setup as suggested
import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))
However, I got the following information
Num GPUs Available: 0
My machine does have GPU card, shown as follows, why it is not picked by Tensorflow

It is likely that you do not have the right combination of the following:
CUDA
CuDNN
TensorFlow
Please check my answer here, for a correct combination of the aforementioned : Tensorflow 2.0 can't use GPU, something wrong in cuDNN? :Failed to get convolution algorithm. This is probably because cuDNN failed to initialize

Related

Using the RTX 3070 laptop GPU for CNN model training with a windows system

I'm trying to use my laptop RTX 3070 GPU for CNN model training because I have to employ a exhastive grid search to tune the hyper parameters. I tried many different methods however, I could not get it done. Can anyone kindly point me in the right direction?
I followed the following procedure.
The procedure:
Installed the NVIDIA CUDA Toolkit 11.2
Installed NVIDIA cuDNN 8.1 by downloading and pasting the files (bin,include,lib) into the NVIDIA GPU Computing Toolkit/CUDA/V11.2
Setup the environment variable by including the path in the system path for both bin and libnvvm.
Installed tensorflow 2.11 and python 3.8 in a new conda environment.
However, I was unable to setup the system to use the GPU that is available. The code seems to be only using the CPU and when I query the following request I get the below output.
query:
import tensorflow as tf
print("TensorFlow version:", tf.__version__)
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
Output:
TensorFlow version: 2.11.0
Num GPUs Available: 0
Am I missing something here or anyone has the same issue like me?
You should use DirectML plugin. From tensorflow 2.11 Gpu support has been dropped for native windows. you need to use DirectML plugin.
You can follow the tutorial here to install

Anaconda installed CUDA CUdnn and Tensorflow, Why doesn't find the GPU?

I'm using Anaconda prompt to install:
Tensorflow 2.10.0
cudatoolkit 11.3.1
cudnn 8.2.1
I'm using Windows 11 and a RTX 3070 Nvidia graphic card. And all the drives have been updated.
And I tried downloading another version of CUDA and CUdnn in exe.file directly from CUDA website. And
added the directories into system path.The folder looks like this:
But whenever I type in:
import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))
It gives me Num GPUs Available: 0
And surely it eats my CPU for computing everything.
It once succeeded when I used Colab and use GPU as the accelerator then created a session using my GPU. That time the GPU load has been maximized. But later don't know how, I can't use my own GPU for training in Colab or even their default free GPU.
Please help. ChatGPT doesn't give me correct information since it only referred to knowledge before 2020. It keeps asking me to install 'tensorflow-gpu' which has already been removed.

tf.test.is_gpu_available() returns False on GCP

I am training a CNN on GCP's notebook using a Tesla V100. I've trained a simple yolo on my own custom data and it was pretty fast but not very accurate. So, I decided to write my own code from scratch to solve the specific aspects of the problem that I want to tackle.
I have tried to run my code on Google Colab prior to GCP, and it went well. Tensorflow detects the GPU and is able to use it whether it was a Tesla K80 or T4.
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
tf.test.is_gpu_available() #>>> True
My problem is that, this same function returns a False on GCP notebook, as if Tensorflow is unable to use the GPU it detected on GCP VM. I don't know of any command that forces Tensorflow to use the GPU over CPU, since it does that automatically.
I have already tried to install or uninstall and then install some versions of tensorflow, tensorflow-gpu and tf-nightly-gpu (1.13 and 2.0dev for instance) but it yielded nothing.
output of nvidia-smi
Have you tried using GCP's AI Platform Notebooks instead? They offer VMs that are pre-configured with Tensorflow and have all required GPU drivers installed.

Keras 2.2.4 with TensorFlow 1.4.1 crashing GPU instances

What is the minimum TensorFlow version requirement for Keras version 2.2.4?
I'm having trouble when using a Conv2D architecture, the GPU instance seems to crash, i.e. i can see the GPU memory fill up for a small bit and then the running processes just 'crash'.There is no error, the notebook just 'freezes'. Training dense models for example work fine. This exact same notebook with the Conv2D architecture works fine on my laptop with TensorFlow 1.12.0 & Keras 2.2.4.
I'm expecting that this has something to do with the used Keras & TensorFlow version. The GPU used is a Tesla M10 (that only supports CUDA 8.0?). The server with this M10 has Tensorflow version 1.4.1 and Keras 2.2.4.
Any insights into solving this problem would be really appreciated.
Version compatibility between keras and tensorflow is a problem that probably anyone has faced.
As in my answer here, one combination you could use is tensorflow-gpu 1.4 and keras 2.0.8 . You can also check here for more combinations too.
If you need to use keras 2.2.4 you will have to install tensorflow-gpu 1.11 and later, which needs cuda 9.
Keras - Tensorflow version's compatibility is problem that developers have faced many times.
Just check Tensorflow and Keras compatibility:
Check this link for more info

How can I check if keras/tensorflow is using cuDNN?

I have installed CUDA and cuDNN, but the last was not working, giving a lot of error messages in theano. Now I am training moderate sized deep conv nets in Keras/Tensorflow, without getting any cuDNN error messages. How can I check if cuDNN is now being used?
tl;dr: If tensorflow-gpu works, then CuDNN is used.
The prebuilt binaries of TensorFlow (at least since version 1.3) link to the CuDNN library. If CuDNN is missing, an error message will tell you ImportError: Could not find 'cudnn64_7.dll'. TensorFlow requires that this DLL be installed....
According to the TensorFlow install documentation for version 1.5, CuDNN must be installed for GPU support even if you build it from source. There are still a lot of fallbacks in the TensorFlow code for the case of CuDNN not being available -- as far as I can tell it used to be optional in prior versions.
Here are two lines from the TensorFlow source that explicitly tell and force that CuDNN is required for gpu acceleration.
There is a special GPU version of TensorFlow that needs to be installed in order to use the GPU (and CuDNN). Make sure the installed python package is tensorflow-gpu and not just tensorflow.
You can list the packages containing "tensorflow" with conda list tensorflow (or just pip list, if you do not use anaconda), but make sure you have the right environment activated.
When you run your scripts with GPU support, they will start like this:
Using TensorFlow backend.
2018- ... C:\tf_jenkins\...\gpu\gpu_device.cc:1105] Found device 0 with properties:
name: GeForce GTX 1080 major: 6 minor: 1 memoryClockRate(GHz): 1.7845
To test it, just type into the console:
import tensorflow as tf
tf.Session()
To check if you "see" the CuDNN from your python environment and therewith validate a correct PATH variable, you can try this:
import ctypes
ctypes.WinDLL("cudnn64_7.dll") # use the file name of your cudnn version here.
You might also want to look into the GPU optimized Keras Layers.
CuDNNLSTM
CuDNNGRU
They are significantly faster:
https://keras.io/layers/recurrent/#cudnnlstm
We saw a 10x improvement going from the LSTM to CuDNNLSTM Keras layers.
Note:
We also saw a 10x increase in VMS (virtual memory) usage on the machine. So there are tradeoffs to consider.