Tensorflow GPU is not identifying my GPUs - tensorflow

I have been trying to use Tensorflow GPU, but apparently Tensorflow is not identifying my GPUs.
When I run:
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
As an output, only my CPU shows up. I have checked all of the versions of everything and they seem to be compatible. I have CUDA 10.1 with CUDA Toolkit, cuDNN 7.5 and Tensorflow 1.13.1. I am running everything on Ubuntu 18.xx
What am I doing wrong?

What is the output of:
from tensorflow.python.client import device_lib
device_lib.list_local_devices()
On my system, tensorflow is not recognizing the GPU because it is listed as an XLA_GPU. I'm not really sure why an XLA_GPU is not also counted as a GPU; it seems there is an OR statement missing somewhere in the tensorflow-gpu code.
If the above code does not list any GPUs (and you have one):
pip uninstall tensorflow
pip uninstall tensorflow-gpu
pip install tensorflow-gpu
… worked for me.
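To see whether you are in the XLA_GPU situation described above, you can group the device names by type. This is only a minimal sketch: group_devices is a hypothetical helper (not part of TensorFlow), and it assumes names of the form '/device:TYPE:INDEX'. It works on plain strings, so it runs even without TensorFlow installed.

```python
from collections import defaultdict


def group_devices(names):
    """Group device names such as '/device:GPU:0' by their type field."""
    groups = defaultdict(list)
    for name in names:
        # '/device:XLA_GPU:0' -> ['/device', 'XLA_GPU', '0']
        parts = name.split(":")
        device_type = parts[1] if len(parts) >= 3 else "UNKNOWN"
        groups[device_type].append(name)
    return dict(groups)


# Example input: the kind of list you would get from
# [d.name for d in device_lib.list_local_devices()]
names = ["/device:CPU:0", "/device:XLA_CPU:0", "/device:XLA_GPU:0"]
groups = group_devices(names)
if "GPU" not in groups and "XLA_GPU" in groups:
    print("Only an XLA_GPU device is listed; no plain GPU device was registered.")
```

If only XLA_GPU shows up, the reinstall above is worth trying first.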

Related

Tensorflow not detecting my gpu even with all requisite files installed

I had tensorflow gpu 2.10 installed and it was working well. I mistakenly decided to upgrade to 2.11 without knowing that it doesn't support GPU on Windows. So I uninstalled it and reinstalled tensorflow gpu 2.10. The problem is that now it doesn't detect my GPU.
import tensorflow as tf
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv2D, Flatten, MaxPooling2D, BatchNormalization
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))
print(tf.__version__)
print(tf.test.is_built_with_gpu_support())
The above code gives the output:
Num GPUs Available: 0
2.10.0
True
So the code detects that I have TF built with GPU support, yet it's not detecting the GPU. My GPU is a GTX 960M with CUDA 12.0 and cuDNN 8.7.
Here is a solution (might not be the optimal, but is a solution).
It is a mix between these two sites:
https://towardsdatascience.com/setting-up-tensorflow-gpu-with-cuda-and-anaconda-onwindows-2ee9c39b5c44
https://www.tensorflow.org/install/source_windows
The first one works, but it leaves you with an older version of Tensorflow.
The new configuration should be:
Python: 3.10
Microsoft Visual Studio (MSVS): 2019
CUDA: 11.2
tensorflow_gpu-2.10.0 (for some reason, I couldn't install 2.11, but 2.10 worked ok)
The algorithm is:
Install Anaconda (if it is not already installed)
Go to Anaconda Prompt, and write:
conda create --name tf-gpu
conda activate tf-gpu
conda install python=3.10
conda install -c anaconda cudatoolkit=11.2
conda install pip
pip install tensorflow-gpu==2.10
That's it, hope it works (it did for me).
Remember to activate your tf-gpu environment each time you want to use it.
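For what it's worth, the mismatch in the question (CUDA 12.0 / cuDNN 8.7 with TF 2.10) can be spotted with a small lookup. This is a rough sketch, not an official tool: TESTED_BUILDS and check_versions are hypothetical names, the table only holds combinations quoted in this thread, and the cuDNN 8.1 entry for 2.10 is my assumption from TensorFlow's tested-build table rather than something stated in the answer.

```python
# Tested combinations quoted in this thread. The cuDNN value for 2.10 is an
# assumption taken from TensorFlow's tested-build table, not from the answer.
TESTED_BUILDS = {
    "2.10": {"cuda": "11.2", "cudnn": "8.1"},
    "1.13": {"cuda": "10.0", "cudnn": "7.4"},
}


def major_minor(version):
    """Reduce '2.10.0' or '11.2.1' to its 'major.minor' prefix."""
    return ".".join(version.split(".")[:2])


def check_versions(tf_version, cuda_version, cudnn_version):
    """Return a list of human-readable mismatches (empty if all match)."""
    expected = TESTED_BUILDS.get(major_minor(tf_version))
    if expected is None:
        return [f"no entry for TensorFlow {tf_version} in TESTED_BUILDS"]
    problems = []
    if major_minor(cuda_version) != expected["cuda"]:
        problems.append(f"CUDA {cuda_version} installed, {expected['cuda']} expected")
    if major_minor(cudnn_version) != expected["cudnn"]:
        problems.append(f"cuDNN {cudnn_version} installed, {expected['cudnn']} expected")
    return problems


# The setup from the question: TF 2.10.0 with CUDA 12.0 and cuDNN 8.7.
for problem in check_versions("2.10.0", "12.0", "8.7"):
    print(problem)
```

Extend the table with the row for your own TF version before relying on it.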

How to use system GPU in Jupyter notebook?

I tried a lot of things before I finally figured out this approach. There are a lot of videos and blogs that tell you to install the CUDA Toolkit and cuDNN from the website and to check for compatible versions. But this is not required anymore; all you have to do is the following:
pip install tensorflow-gpu
pip install cuda
pip install cudnn
then use the following code to check if your GPU is active in the current notebook
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))
tf.config.list_physical_devices('GPU')
from tensorflow.python.client import device_lib
device_lib.list_local_devices()
tf.test.is_built_with_cuda()
tf.debugging.set_log_device_placement(True)
I just want to confirm whether these steps are enough to enable the GPU in a Jupyter notebook, or am I missing something here?
If you installed the compatible versions of CUDA and cuDNN (relative to your GPU), Tensorflow should use that since you installed tensorflow-gpu. If you want to be sure, run a simple demo and check out the usage on the task manager.
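If you want a single notebook cell that does not crash when something is missing, something like the following might help. It is only a sketch: gpu_report is a hypothetical helper, and it assumes a TensorFlow version that provides tf.config.list_physical_devices.

```python
import importlib.util


def gpu_report():
    """Collect basic facts about the TensorFlow install visible to this kernel."""
    report = {"tensorflow_found": False, "version": None, "gpus": [], "error": None}
    if importlib.util.find_spec("tensorflow") is None:
        return report  # TensorFlow is not importable from this interpreter
    report["tensorflow_found"] = True
    try:
        import tensorflow as tf

        report["version"] = tf.__version__
        report["gpus"] = [d.name for d in tf.config.list_physical_devices("GPU")]
    except Exception as exc:  # a broken install can fail in many ways
        report["error"] = repr(exc)
    return report


print(gpu_report())
```

An empty "gpus" list with no error means TensorFlow imported fine but did not find a usable GPU.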

Tensorflow after 1.15 - No need to install tensorflow-gpu package

Question
Please confirm that, to use both CPU and GPU with TensorFlow after 1.15, installing the tensorflow package is enough and tensorflow-gpu is no longer required.
Background
I still see articles telling readers to install tensorflow-gpu, e.g. pip install tensorflow-gpu==2.2.0, and the PyPI repository for the tensorflow-gpu package is active with the latest tensorflow-gpu 2.4.1.
The Anaconda document also still refers to the tensorflow-gpu package.
Working with GPU packages - Available packages - TensorFlow
TensorFlow is a general machine learning library, but most popular for deep learning applications. There are three supported variants of the tensorflow package in Anaconda, one of which is the NVIDIA GPU version. This is selected by installing the meta-package tensorflow-gpu:
However, according to the TensorFlow v2.4.1 (as of Apr 2021) Core document GPU support - Older versions of TensorFlow
For releases 1.15 and older, CPU and GPU packages are separate:
pip install tensorflow==1.15 # CPU
pip install tensorflow-gpu==1.15 # GPU
According to the TensorFlow Core Guide Use a GPU.
TensorFlow code, and tf.keras models will transparently run on a single GPU with no code changes required.
According to Difference between installation libraries of TensorFlow GPU vs CPU.
Just a quick (unnecessary?) note... from TensorFlow 2.0 onwards these are not separated, and you simply install tensorflow (as this includes GPU support if you have an appropriate card/CUDA installed).
Hence I would like a definite confirmation that the tensorflow-gpu package is kept only for convenience (legacy scripts that specify tensorflow-gpu, etc.) and is no longer required, i.e. that there is no difference between the tensorflow and tensorflow-gpu packages now.
It's reasonable to get confused here about the package naming. However, here is my understanding. For tf 1.15 or older, the CPU and GPU packages are separate:
pip install tensorflow==1.15 # CPU
pip install tensorflow-gpu==1.15 # GPU
So, if I want to work entirely on the CPU version of tf, I would go with the first command and otherwise, if I want to work entirely on the GPU version of tf, I would go with the second command.
Now, in tf 2.0 or above, we only need one command that conveniently works on both kinds of hardware. So, on both CPU-only and GPU-equipped systems, we use the same command to install tf, and that is:
pip install tensorflow
Now, we can test it on a CPU based system (no GPU):
import tensorflow as tf
print(tf.__version__)
print('1: ', tf.config.list_physical_devices('GPU'))
print('2: ', tf.test.is_built_with_cuda)  # missing (): prints the function object; call is_built_with_cuda() for True/False
print('3: ', tf.test.gpu_device_name())
print('4: ', tf.config.get_visible_devices())
2.4.1
1: []
2: <function is_built_with_cuda at 0x7f2ce91415f0>
3:
4: [PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')]
or also test it on a system with a GPU:
2.4.1
1: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
2: <function is_built_with_cuda at 0x7fb6affd0560>
3: /device:GPU:0
4: [PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'),
PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
So, as you can see, this is a single command for both the CPU and GPU cases. Hope it's clearer now. But for now (in tf >= 2) we can still use the -gpu / -cpu postfix while installing tf, which gives a package dedicated to GPU / CPU respectively.
!pip install tensorflow-gpu
....
Installing collected packages: tensorflow-gpu
Successfully installed tensorflow-gpu-2.4.1
# -------------------------------------------------------------
!pip install tensorflow-cpu
....
Installing collected packages: tensorflow-cpu
Successfully installed tensorflow-cpu-2.4.1
Check: Similar response from tf-team.
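Since several package names are in play here, it can help to check which ones are actually installed in the current environment. A small sketch using only the standard library (Python 3.8+); installed_variants is a hypothetical helper, and the candidate list is just the set of names mentioned in this thread.

```python
from importlib import metadata

# Distribution names mentioned in this thread.
CANDIDATES = ["tensorflow", "tensorflow-gpu", "tensorflow-cpu", "tensorflow-intel"]


def installed_variants(names=CANDIDATES):
    """Map each installed distribution name to its version string."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # this variant is not installed in the current environment
    return found


print("Installed TensorFlow distributions:", installed_variants() or "none")
```

This only reads package metadata; it does not import TensorFlow or touch the GPU.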

Tensorflow import error: Naive Pip Windows 10

I'm having problems with the import below; a screenshot of the error follows. I badly need a solution.
import tensorflow as tf
https://i.stack.imgur.com/jgghK.png
You're using tensorflow-gpu but do not have CUDA / cuDNN installed on your computer.
Short answer: run pip uninstall tensorflow-gpu and pip install tensorflow.
Long answer: Install CUDA / cuDNN
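To see the underlying error text instead of a long traceback, you could wrap the import. A minimal sketch; try_import is a hypothetical helper, and the assumption is that the missing CUDA / cuDNN libraries show up as an ImportError, as in the screenshot.

```python
import importlib


def try_import(module_name):
    """Import a module by name and return (module_or_None, error_message_or_None)."""
    try:
        return importlib.import_module(module_name), None
    except ImportError as exc:
        # Assumption: a missing CUDA/cuDNN library is reported as an ImportError.
        return None, str(exc)


module, error = try_import("tensorflow")
if error is not None:
    print("Import failed:", error)
```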

Tensorflow doesn't seem to see my gpu

I've tried tensorflow on both cuda 7.5 and 8.0, w/o cudnn (my GPU is old, cudnn doesn't support it).
When I execute device_lib.list_local_devices(), there is no gpu in the output. Theano sees my gpu, and works fine with it, and examples in /usr/share/cuda/samples work fine as well.
I installed tensorflow through pip install. Is my GPU (a GTX 460) too old for tf to support it?
I came across this same issue in jupyter notebooks. This could be an easy fix.
$ pip uninstall tensorflow
$ pip install tensorflow-gpu
You can check if it worked with:
tf.test.gpu_device_name()
Update 2020
It seems that tensorflow 2.0+ comes with GPU capabilities, therefore
pip install tensorflow should be enough.
Summary:
1. check if tensorflow sees your GPU (optional)
2. check if your videocard can work with tensorflow (optional)
3. find versions of CUDA Toolkit and cuDNN SDK, compatible with your tf version
4. install CUDA Toolkit
5. install cuDNN SDK
6. pip uninstall tensorflow; pip install tensorflow-gpu
7. check if tensorflow sees your GPU
* source - https://www.tensorflow.org/install/gpu
Detailed instruction:
check if tensorflow sees your GPU (optional)
from tensorflow.python.client import device_lib
def get_available_devices():
    local_device_protos = device_lib.list_local_devices()
    return [x.name for x in local_device_protos]
print(get_available_devices())
# my output was => ['/device:CPU:0']
# good output must be => ['/device:CPU:0', '/device:GPU:0']
check if your card can work with tensorflow (optional)
my PC: GeForce GTX 1060 notebook (driver version - 419.35), windows 10, jupyter notebook
tensorflow needs Compute Capability 3.5 or higher. (https://www.tensorflow.org/install/gpu#hardware_requirements)
https://developer.nvidia.com/cuda-gpus
select "CUDA-Enabled GeForce Products"
result - "GeForce GTX 1060 Compute Capability = 6.1"
my card can work with tf!
find versions of CUDA Toolkit and cuDNN SDK, that you need
a) find your tf version
import sys
print(sys.version)
# 3.6.4 |Anaconda custom (64-bit)| (default, Jan 16 2018, 10:22:32) [MSC v.1900 64 bit (AMD64)]
import tensorflow as tf
print(tf.__version__)
# my output was => 1.13.1
b) find right versions of CUDA Toolkit and cuDNN SDK for your tf version
https://www.tensorflow.org/install/source#linux
* it is written for linux, but worked in my case
see, that tensorflow_gpu-1.13.1 needs: CUDA Toolkit v10.0, cuDNN SDK v7.4
install CUDA Toolkit
a) install CUDA Toolkit 10.0
https://developer.nvidia.com/cuda-toolkit-archive
select: CUDA Toolkit 10.0 and download base installer (2 GB)
installation settings: select only CUDA
(my installation path was: D:\Programs\x64\Nvidia\Cuda_v_10_0\Development)
b) add environment variables:
system variables / path must have:
D:\Programs\x64\Nvidia\Cuda_v_10_0\Development\bin
D:\Programs\x64\Nvidia\Cuda_v_10_0\Development\libnvvp
D:\Programs\x64\Nvidia\Cuda_v_10_0\Development\extras\CUPTI\libx64
D:\Programs\x64\Nvidia\Cuda_v_10_0\Development\include
install cuDNN SDK
a) download cuDNN SDK v7.4
https://developer.nvidia.com/rdp/cudnn-archive (needs registration, but it is simple)
select "Download cuDNN v7.4.2 (Dec 14, 2018), for CUDA 10.0"
b) add path to 'bin' folder into "environment variables / system variables / path":
D:\Programs\x64\Nvidia\cudnn_for_cuda_10_0\bin
pip uninstall tensorflow
pip install tensorflow-gpu
check if tensorflow sees your GPU
- restart your PC
- print(get_available_devices())
- # now this code should return => ['/device:CPU:0', '/device:GPU:0']
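The environment-variable step is easy to get wrong, so here is a rough way to check it from Python. This is a sketch only: missing_path_entries is a hypothetical helper, the directories are the install locations from the steps above (yours will differ), and the case-insensitive comparison assumes Windows-style paths.

```python
import os


def missing_path_entries(required, path_value, sep=os.pathsep):
    """Return the required directories that do not appear in a PATH-style string."""

    def norm(p):
        # Ignore case and trailing slashes (a Windows-oriented assumption).
        return p.strip().rstrip("\\/").lower()

    present = {norm(p) for p in path_value.split(sep) if p.strip()}
    return [d for d in required if norm(d) not in present]


# Directories from the steps above (that author's own install location).
required = [
    r"D:\Programs\x64\Nvidia\Cuda_v_10_0\Development\bin",
    r"D:\Programs\x64\Nvidia\Cuda_v_10_0\Development\libnvvp",
    r"D:\Programs\x64\Nvidia\cudnn_for_cuda_10_0\bin",
]
for directory in missing_path_entries(required, os.environ.get("PATH", "")):
    print("Not on PATH:", directory)
```

Remember that a change to the system PATH is only picked up by terminals and notebooks started afterwards.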
If you are using conda, you might have installed the CPU version of tensorflow. Check the package list (conda list) of the environment to see if this is the case. If so, remove the package by using conda remove tensorflow and install keras-gpu instead (conda install -c anaconda keras-gpu). This will install everything you need to run your machine learning code on the GPU. Cheers!
P.S. You should first check that you have installed the drivers correctly using nvidia-smi. By default, this is not in your PATH, so you might also need to add the folder to your path. The .exe file can be found at C:\Program Files\NVIDIA Corporation\NVSMI
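The nvidia-smi check can also be scripted. A minimal sketch, assuming the NVSMI folder named above as an extra search location; find_nvidia_smi is a hypothetical helper.

```python
import os
import shutil
import subprocess

# Extra directory mentioned above for Windows driver installs.
NVSMI_DIR = r"C:\Program Files\NVIDIA Corporation\NVSMI"


def find_nvidia_smi():
    """Return the full path to nvidia-smi, or None if it cannot be found."""
    search_path = os.environ.get("PATH", "") + os.pathsep + NVSMI_DIR
    return shutil.which("nvidia-smi", path=search_path)


exe = find_nvidia_smi()
if exe is None:
    print("nvidia-smi not found; the NVIDIA driver may not be installed.")
else:
    # '-L' lists the GPUs the driver can see, one per line.
    result = subprocess.run([exe, "-L"], capture_output=True, text=True)
    print(result.stdout or result.stderr)
```

If nvidia-smi itself cannot see the card, no TensorFlow reinstall will help until the driver is fixed.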
When I look up your GPU, I see that it only supports CUDA Compute Capability 2.1. (Can be checked through https://developer.nvidia.com/cuda-gpus) Unfortunately, TensorFlow needs a GPU with minimum CUDA Compute Capability 3.0.
https://www.tensorflow.org/get_started/os_setup#optional_install_cuda_gpus_on_linux
You might see some logs from TensorFlow checking your GPU, but ultimately the library will avoid using an unsupported GPU.
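The capability check itself is a plain version comparison. A small sketch using the cards and numbers quoted in this thread; meets_minimum is a hypothetical helper, and the default minimum of 3.5 is the figure another answer here quotes from the hardware requirements (this answer says 3.0), so pass the value that matches your TF version.

```python
# Compute capabilities quoted in this thread.
KNOWN_CARDS = {
    "GeForce GTX 460": (2, 1),
    "GeForce GTX 1060": (6, 1),
    "GeForce 940MX": (5, 0),
}


def meets_minimum(capability, minimum=(3, 5)):
    """Compare (major, minor) tuples; tuple comparison orders them correctly."""
    return capability >= minimum


for card, capability in KNOWN_CARDS.items():
    status = "ok" if meets_minimum(capability) else "too old"
    print(f"{card}: {capability[0]}.{capability[1]} -> {status}")
```

Comparing tuples avoids the float trap where 3.10 would sort below 3.5.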
The following worked for me on an HP laptop with Windows 7. I have an Nvidia card compatible with CUDA Compute Capability (version) 3.0.
pip3.6.exe uninstall tensorflow-gpu
pip3.6.exe uninstall tensorflow-gpu
pip3.6.exe install tensorflow-gpu
I had a problem because I didn't specify the version of Tensorflow, so the version I got was 2.11. After many hours I found that my problem is described in the install guide:
Caution: TensorFlow 2.10 was the last TensorFlow release that supported GPU on native-Windows. Starting with TensorFlow 2.11, you will need to install TensorFlow in WSL2, or install tensorflow-cpu and, optionally, try the TensorFlow-DirectML-Plugin
Before that, I read most of the answers to this and similar questions. I followed @AndrewPt's answer. I already had CUDA installed but updated the version just in case, installed cuDNN, and restarted the computer.
The easiest solution for me was to downgrade to 2.10 (you can try different options mentioned in the install guide). I first uninstalled all of these packages (probably it's not necessary, but I didn't want to see how pip messed up versions at 2 am):
pip uninstall keras
pip uninstall tensorflow-io-gcs-filesystem
pip uninstall tensorflow-estimator
pip uninstall tensorflow
pip uninstall Keras-Preprocessing
pip uninstall tensorflow-intel
because I wanted only the packages required for the old version (I did not uninstall every package that the 2.11 version had pulled in). After that I installed tensorflow 2.10 (the quotes stop the shell from treating < as a redirect):
pip install "tensorflow<2.11"
and it worked.
I used this code to check if GPU is visible:
import tensorflow as tf
print(tf.config.list_physical_devices('GPU'))
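The rule from the install guide can be written down as a tiny check. This is just a sketch of the caution quoted above; hits_windows_gpu_cutoff and version_tuple are hypothetical helpers.

```python
import platform


def version_tuple(version):
    """Turn '2.11.0' into (2, 11); only major and minor matter here."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def hits_windows_gpu_cutoff(tf_version, system):
    """True if this is TensorFlow 2.11 or later on native Windows."""
    return system == "Windows" and version_tuple(tf_version) >= (2, 11)


# platform.system() reports 'Windows' on native Windows and 'Linux' inside WSL2.
for candidate in ("2.10.1", "2.11.0"):
    if hits_windows_gpu_cutoff(candidate, platform.system()):
        print(f"TensorFlow {candidate}: no GPU support on native Windows")
    else:
        print(f"TensorFlow {candidate}: not affected by the 2.11 cutoff here")
```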
So as of 2022-04, the tensorflow package contains both CPU and GPU builds. To install a GPU build, search to see what's available:
λ conda search tensorflow
Loading channels: done
# Name Version Build Channel
tensorflow 0.12.1 py35_1 conda-forge
tensorflow 0.12.1 py35_2 conda-forge
tensorflow 1.0.0 py35_0 conda-forge
…
tensorflow 2.5.0 mkl_py39h1fa1df6_0 pkgs/main
tensorflow 2.6.0 eigen_py37h37bbdb1_0 pkgs/main
tensorflow 2.6.0 eigen_py38h63d3545_0 pkgs/main
tensorflow 2.6.0 eigen_py39h855417c_0 pkgs/main
tensorflow 2.6.0 gpu_py37h3e8f0e3_0 pkgs/main
tensorflow 2.6.0 gpu_py38hc0e8100_0 pkgs/main
tensorflow 2.6.0 gpu_py39he88c5ba_0 pkgs/main
tensorflow 2.6.0 mkl_py37h9623b36_0 pkgs/main
tensorflow 2.6.0 mkl_py38hdc16138_0 pkgs/main
tensorflow 2.6.0 mkl_py39h31650da_0 pkgs/main
You can see that there are builds of TF 2.6.0 that support Python 3.7, 3.8 and 3.9, and that are built for MKL (Intel CPU), Eigen, or GPU.
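If you want to pick the builds apart programmatically, the build string can be parsed with a regular expression. A minimal sketch: parse_build is a hypothetical helper, and it assumes build strings shaped like the ones in the listing above (variant, then pyXY, then a hash starting with "h").

```python
import re

# variant, then 'py' + one major digit + minor digits, e.g. 'gpu_py39h...'
BUILD_RE = re.compile(r"^(?P<variant>gpu|mkl|eigen)_py(?P<major>\d)(?P<minor>\d+)")


def parse_build(build):
    """Split 'gpu_py39he88c5ba_0' into ('gpu', '3.9'); None if it does not match."""
    match = BUILD_RE.match(build)
    if match is None:
        return None
    return match["variant"], f"{match['major']}.{match['minor']}"


for build in ["eigen_py37h37bbdb1_0", "gpu_py39he88c5ba_0", "mkl_py38hdc16138_0"]:
    print(build, "->", parse_build(build))
```

Older build strings such as py35_1 carry no variant prefix, so the helper returns None for them.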
To narrow it down, you can use wildcards in the search. This will find any Tensorflow 2.x version that is built for GPU, for instance:
λ conda search tensorflow=2*=gpu*
Loading channels: done
# Name Version Build Channel
tensorflow 2.0.0 gpu_py36hfdd5754_0 pkgs/main
tensorflow 2.0.0 gpu_py37h57d29ca_0 pkgs/main
tensorflow 2.1.0 gpu_py36h3346743_0 pkgs/main
tensorflow 2.1.0 gpu_py37h7db9008_0 pkgs/main
tensorflow 2.5.0 gpu_py37h23de114_0 pkgs/main
tensorflow 2.5.0 gpu_py38h8e8c102_0 pkgs/main
tensorflow 2.5.0 gpu_py39h7dc34a2_0 pkgs/main
tensorflow 2.6.0 gpu_py37h3e8f0e3_0 pkgs/main
tensorflow 2.6.0 gpu_py38hc0e8100_0 pkgs/main
tensorflow 2.6.0 gpu_py39he88c5ba_0 pkgs/main
To install a specific version in an otherwise empty environment, you can use a command like:
λ conda activate tf
(tf) λ conda install tensorflow=2.6.0=gpu_py39he88c5ba_0
…
The following NEW packages will be INSTALLED:
_tflow_select pkgs/main/win-64::_tflow_select-2.1.0-gpu
…
cudatoolkit pkgs/main/win-64::cudatoolkit-11.3.1-h59b6b97_2
cudnn pkgs/main/win-64::cudnn-8.2.1-cuda11.3_0
…
tensorflow pkgs/main/win-64::tensorflow-2.6.0-gpu_py39he88c5ba_0
tensorflow-base pkgs/main/win-64::tensorflow-base-2.6.0-gpu_py39hb3da07e_0
…
As you can see, if you install a GPU build, it will automatically also install compatible cudatoolkit and cudnn packages. You don't need to manually check versions for compatibility, or manually download several gigabytes from Nvidia's website, or register as a developer, as it says in other answers or on the official website.
After installation, confirm that it worked and it sees the GPU by running:
λ python
Python 3.9.12 (main, Apr 4 2022, 05:22:27) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> tf.__version__
'2.6.0'
>>> tf.config.list_physical_devices()
[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'), PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
Getting conda to install a GPU build together with the other packages you want to use is another story, however, because I ran into a lot of package incompatibilities. I think the best you can do is specify the installation criteria using wildcards and cross your fingers.
This tries to install any TF 2.x version that's built for GPU and that has dependencies compatible with Spyder and matplotlib's dependencies, for instance:
λ conda install tensorflow=2*=gpu* spyder matplotlib
For me, this ended up installing a two-year-old GPU version of tensorflow:
matplotlib pkgs/main/win-64::matplotlib-3.5.1-py37haa95532_1
spyder pkgs/main/win-64::spyder-5.1.5-py37haa95532_1
tensorflow pkgs/main/win-64::tensorflow-2.1.0-gpu_py37h7db9008_0
I had previously been using the tensorflow-gpu package, but that doesn't work anymore. conda typically grinds forever trying to find compatible packages to install, and even when it's installed, it doesn't actually install a gpu build of tensorflow or the CUDA dependencies:
λ conda list
…
cookiecutter 1.7.2 pyhd3eb1b0_0
cryptography 3.4.8 py38h71e12ea_0
cycler 0.11.0 pyhd3eb1b0_0
dataclasses 0.8 pyh6d0b6a4_7
…
tensorflow 2.3.0 mkl_py38h8557ec7_0
tensorflow-base 2.3.0 eigen_py38h75a453f_0
tensorflow-estimator 2.6.0 pyh7b7c402_0
tensorflow-gpu 2.3.0 he13fc11_0
I have had an issue where I needed the latest TensorFlow (2.8.0 at the time of writing) with GPU support running in a conda environment. The problem was that it was not available via conda. What I did was
conda install cudatoolkit==11.2
pip install tensorflow-gpu==2.8.0
Although I had checked that the CUDA toolkit version was compatible with the tensorflow version, it was still returning an error saying that libcudart.so.11.0 was not found. As a result, GPUs were not visible. The remedy was to set the environment variable LD_LIBRARY_PATH to point to your anaconda3/envs/<your_tensorflow_environment>/lib with this command:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/<user>/anaconda3/envs/<your_tensorflow_environment>/lib
Unless you make it permanent, you will need to create this variable every time you start a terminal prior to a session (jupyter notebook). It can be conveniently automated by following this procedure from conda's official website.
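To confirm that the variable actually points at a directory containing the library, you can scan it from Python. A rough sketch; find_shared_library is a hypothetical helper. Note that the dynamic loader also searches other locations, so an empty result here is a hint, not proof that the library is missing.

```python
import os


def find_shared_library(prefix, search_path):
    """Return files starting with `prefix` in an os.pathsep-separated dir list."""
    hits = []
    for directory in search_path.split(os.pathsep):
        if not directory or not os.path.isdir(directory):
            continue  # skip empty entries and directories that do not exist
        for filename in sorted(os.listdir(directory)):
            if filename.startswith(prefix):
                hits.append(os.path.join(directory, filename))
    return hits


hits = find_shared_library("libcudart.so", os.environ.get("LD_LIBRARY_PATH", ""))
print(hits if hits else "libcudart.so* not found on LD_LIBRARY_PATH")
```

Run it in the same terminal (or notebook kernel) that you start TensorFlow from, since the variable is per session.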
In my case, I had a working tensorflow-gpu version 1.14 but suddenly it stopped working. I fixed the problem using:
pip uninstall tensorflow-gpu==1.14
pip install tensorflow-gpu==1.14
I experienced the same problem on my Windows OS. I followed tensorflow's instructions on installing CUDA, cudnn, etc., and tried the suggestions in the answers above - with no success.
What solved my issue was to update my GPU drivers. You can update them via:
1. Press the Windows key + R.
2. Enter devmgmt.msc.
3. Right-click on "Display adapters" and click on the "Properties" option.
4. Go to the "Driver" tab and select "Update Driver".
5. Finally, click on "Search automatically for updated driver software".
Restart your machine and run the following check again:
from tensorflow.python.client import device_lib
local_device_protos = device_lib.list_local_devices()
[x.name for x in local_device_protos]
Sample output:
2022-01-17 13:41:10.557751: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: GeForce 940MX major: 5 minor: 0 memoryClockRate(GHz): 1.189
pciBusID: 0000:01:00.0
2022-01-17 13:41:10.558125: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2022-01-17 13:41:10.562095: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2022-01-17 13:45:11.392814: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2022-01-17 13:45:11.393617: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187] 0
2022-01-17 13:45:11.393739: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0: N
2022-01-17 13:45:11.401271: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/device:GPU:0 with 1391 MB memory) -> physical GPU (device: 0, name: GeForce 940MX, pci bus id: 0000:01:00.0, compute capability: 5.0)
>>> [x.name for x in local_device_protos]
['/device:CPU:0', '/device:GPU:0']