Google Colab Tensorflow 1.15 GPU

Does anyone know whether Google Colab's GPUs are only compatible with TensorFlow 2.x? I'm trying to run TensorFlow 1 code, so I pip-installed tensorflow 1.15 and tensorflow-gpu 1.15 and changed my notebook settings to enable the GPU, but I don't see any GPU speed-up.

Related

Tensorflow Loss function is NAN when using GPU

I am trying to train a custom object detection model using a pre-trained model from the TensorFlow 1 Model Zoo.
I am using the model ssd_mobilenet_v2_coco_2018_03_29.
I set up a training environment following this tutorial: https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/tensorflow-1.14/training.html
However, when I tried to train the model using tensorflow-gpu==1.14.0, I always got the error Model diverged with loss = NaN.
Then I uninstalled tensorflow-gpu==1.14.0 and installed tensorflow==1.14.0 (so it did not use my GPU), and all of a sudden it started to work!
I have no idea how that is possible...
The command I am using:
python model_main.py --alsologtostderr --model_dir=models\ssd_mobilenet_v2_coco_2018_03_29\export --pipeline_config_path=models\ssd_mobilenet_v2_coco_2018_03_29\pipeline.config --num_train_steps=2000
Python version is 3.7
OS is Windows 10
My graphics card is an Nvidia GeForce RTX 3050; I used CUDA v10.0 and cuDNN v7.4.1.
Any ideas?
This is because RTX 30-series cards are not supported by CUDA 10. If you need TF v1 (1.15), you can install NVIDIA's TensorFlow build (1.15), which runs on CUDA 11.
pip install nvidia-pyindex
pip install nvidia-tensorflow[horovod]
Note: it only supports Python 3.6 or 3.8 (not 3.7).
https://developer.nvidia.com/blog/accelerating-tensorflow-on-a100-gpus/
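To make the compatibility argument in the answer above concrete, here is a minimal sketch of a lookup that checks whether a CUDA release can target a given GPU. The table of minimum CUDA versions per compute capability and the helper name cuda_supports are assumptions added for illustration; they are not part of the original answer, and the table is not exhaustive.

```python
# Minimum CUDA toolkit release per GPU compute capability.
# Illustrative values (an assumption, not taken from the answer above).
MIN_CUDA_FOR_CC = {
    (6, 1): (8, 0),   # Pascal, e.g. GTX 10-series
    (7, 0): (9, 0),   # Volta
    (7, 5): (10, 0),  # Turing, e.g. RTX 20-series
    (8, 0): (11, 0),  # Ampere, e.g. A100
    (8, 6): (11, 1),  # Ampere, e.g. RTX 30-series
}


def cuda_supports(compute_capability, cuda_version):
    """Return True if the given CUDA release can target the GPU."""
    return cuda_version >= MIN_CUDA_FOR_CC[compute_capability]


# An RTX 30-series card (compute capability 8.6) with CUDA 10.0 vs. 11.1
print(cuda_supports((8, 6), (10, 0)))  # False
print(cuda_supports((8, 6), (11, 1)))  # True
```

Under these assumptions, a stock tensorflow-gpu 1.14 build (which the question pairs with CUDA 10.0) cannot properly target the card, which is consistent with the answer's suggestion to move to a CUDA 11 build.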

How to use system GPU in Jupyter notebook?

I tried a lot of things before I finally figured out this approach. Many videos and blogs say to install the CUDA toolkit and cuDNN from the website after checking for a compatible version. But this is not required anymore; all you have to do is the following:
pip install tensorflow-gpu
pip install cuda
pip install cudnn
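Then use the following code to check whether your GPU is active in the current notebook:
import tensorflow as tf
from tensorflow.python.client import device_lib
# Count and list the GPUs TensorFlow can see
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))
tf.config.list_physical_devices('GPU')
# List every local device (CPU and GPU)
device_lib.list_local_devices()
# True if this TensorFlow build was compiled with CUDA support
tf.test.is_built_with_cuda()
# Log the device each op is placed on
tf.debugging.set_log_device_placement(True)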
I just want to confirm whether these steps are enough to enable the GPU in a Jupyter notebook, or am I missing something here?
If you installed versions of CUDA and cuDNN that are compatible with your GPU, TensorFlow should use them, since you installed tensorflow-gpu. If you want to be sure, run a simple demo and check the GPU usage in the task manager.
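As a quick check before running a full demo, a small helper can report which GPUs TensorFlow sees. This is only a sketch: the function name list_gpus is hypothetical, and the fallback to tf.config.experimental is an assumption meant to cover older releases where the non-experimental call is missing.

```python
def list_gpus():
    """Return the names of GPUs visible to TensorFlow, or None if
    TensorFlow is not installed in this environment."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # Newer releases expose tf.config.list_physical_devices; older ones
    # only have the experimental variant (assumed fallback).
    lister = getattr(tf.config, "list_physical_devices", None)
    if lister is None:
        lister = tf.config.experimental.list_physical_devices
    return [device.name for device in lister("GPU")]


gpus = list_gpus()
if gpus is None:
    print("TensorFlow is not installed")
elif not gpus:
    print("TensorFlow is installed but sees no GPU")
else:
    print("GPUs visible to TensorFlow:", gpus)
```

An empty list means the install succeeded but the CUDA/cuDNN libraries or the driver were not picked up, which is the case worth investigating further.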

Tensorflow after 1.15 - No need to install tensorflow-gpu package

Question
Please confirm that, to use both CPU and GPU with TensorFlow after 1.15, installing the tensorflow package is enough and tensorflow-gpu is no longer required.
Background
I still see articles saying to install tensorflow-gpu, e.g. pip install tensorflow-gpu==2.2.0, and the PyPI repository for the tensorflow-gpu package is active with the latest tensorflow-gpu 2.4.1.
The Anaconda documentation also still refers to the tensorflow-gpu package.
Working with GPU packages - Available packages - TensorFlow
TensorFlow is a general machine learning library, but most popular for deep learning applications. There are three supported variants of the tensorflow package in Anaconda, one of which is the NVIDIA GPU version. This is selected by installing the meta-package tensorflow-gpu:
However, according to the TensorFlow v2.4.1 (as of Apr 2021) Core document GPU support - Older versions of TensorFlow
For releases 1.15 and older, CPU and GPU packages are separate:
pip install tensorflow==1.15 # CPU
pip install tensorflow-gpu==1.15 # GPU
According to the TensorFlow Core Guide Use a GPU.
TensorFlow code, and tf.keras models will transparently run on a single GPU with no code changes required.
According to Difference between installation libraries of TensorFlow GPU vs CPU.
Just a quick (unnecessary?) note... from TensorFlow 2.0 onwards these are not separated, and you simply install tensorflow (as this includes GPU support if you have an appropriate card/CUDA installed).
Hence I would like definite confirmation that the tensorflow-gpu package now exists only for convenience (legacy scripts that specify tensorflow-gpu, etc.) and is no longer required, i.e. that there is no difference between the tensorflow and tensorflow-gpu packages now.
It's reasonable to be confused about the package naming here. Here is my understanding. For tf 1.15 or older, the CPU and GPU packages are separate:
pip install tensorflow==1.15 # CPU
pip install tensorflow-gpu==1.15 # GPU
So, if I want to work entirely with the CPU version of tf, I would use the first command; if I want to work with the GPU version, I would use the second.
Now, in tf 2.0 or above, we only need one command that conveniently works on both kinds of hardware. So, on CPU-only and GPU systems alike, we use the same command to install tf:
pip install tensorflow
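The naming rule described above can be sketched as a small helper that returns the pip requirement string for a given version. The function name pip_requirement is hypothetical and only encodes the rule as stated in this answer (separate packages before 2.0, a single package from 2.0 onwards).

```python
def pip_requirement(version, gpu):
    """Return the pip requirement for a TensorFlow version.

    Encodes the rule from the answer: 1.x has separate CPU and GPU
    packages, while 2.x ships both in the single 'tensorflow' package.
    """
    major = int(version.split(".")[0])
    if major < 2 and gpu:
        return "tensorflow-gpu==" + version
    return "tensorflow==" + version


print(pip_requirement("1.15", gpu=False))  # tensorflow==1.15
print(pip_requirement("1.15", gpu=True))   # tensorflow-gpu==1.15
print(pip_requirement("2.4.1", gpu=True))  # tensorflow==2.4.1
```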
Now, we can test it on a CPU-based system (no GPU):
import tensorflow as tf
print(tf.__version__)
print('1: ', tf.config.list_physical_devices('GPU'))
print('2: ', tf.test.is_built_with_cuda)
print('3: ', tf.test.gpu_device_name())
print('4: ', tf.config.get_visible_devices())
2.4.1
1: []
2: <function is_built_with_cuda at 0x7f2ce91415f0>
3:
4: [PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')]
or test it on a system with a GPU:
2.4.1
1: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
2: <function is_built_with_cuda at 0x7fb6affd0560>
3: /device:GPU:0
4: [PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'),
PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
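Note that item 2 in both outputs above is a function object rather than a boolean, because tf.test.is_built_with_cuda was printed without being called. A minimal sketch of the intended check follows; the wrapper name built_with_cuda is hypothetical, and the ImportError guard is only there so the snippet also runs where TensorFlow is absent.

```python
def built_with_cuda():
    """Return whether the installed TensorFlow build has CUDA support,
    or None if TensorFlow is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # Note the parentheses: this calls the function instead of
    # printing the function object itself.
    return tf.test.is_built_with_cuda()


print("built with CUDA:", built_with_cuda())
```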
So, as you can see, a single command covers both the CPU and GPU cases. I hope it's clearer now. For now (in tf >= 2), we can still use the -gpu / -cpu suffix when installing tf to get a package dedicated to GPU / CPU respectively.
!pip install tensorflow-gpu
....
Installing collected packages: tensorflow-gpu
Successfully installed tensorflow-gpu-2.4.1
# -------------------------------------------------------------
!pip install tensorflow-cpu
....
Installing collected packages: tensorflow-cpu
Successfully installed tensorflow-cpu-2.4.1
Check: Similar response from tf-team.

Tensorflow GPU is not identifying my GPUs

I have been trying to use TensorFlow GPU, but apparently TensorFlow is not identifying my GPUs.
When I run:
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
As output, only my CPU shows up. I have checked the versions of everything and they seem to be compatible. I have CUDA 10.1 with the CUDA Toolkit, cuDNN 7.5 and TensorFlow 1.13.1. I am running everything on Ubuntu 18.xx.
What am I doing wrong?
What is the output of:
from tensorflow.python.client import device_lib
device_lib.list_local_devices()
On my system, TensorFlow is not recognizing the GPU because it is an XLA_GPU. I'm not really sure why an XLA_GPU is not also a GPU; it seems there is an OR statement missing somewhere in the tensorflow-gpu code.
If the above code does not list any GPUs (and you have one):
pip uninstall tensorflow
pip uninstall tensorflow-gpu
pip install tensorflow-gpu
… worked for me.
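When reading the output of device_lib.list_local_devices(), it can help to filter for both device types mentioned above. The sketch below assumes each entry exposes name and device_type attributes; the helper name gpu_like_devices is hypothetical, and the sample entries are stand-ins so the snippet runs without TensorFlow.

```python
from types import SimpleNamespace


def gpu_like_devices(devices):
    """Return names of devices whose type is GPU or XLA_GPU."""
    return [d.name for d in devices if d.device_type in ("GPU", "XLA_GPU")]


# Stand-in entries mimicking the fields of a local device listing.
sample = [
    SimpleNamespace(name="/device:CPU:0", device_type="CPU"),
    SimpleNamespace(name="/device:XLA_GPU:0", device_type="XLA_GPU"),
]
print(gpu_like_devices(sample))  # ['/device:XLA_GPU:0']
```

With TensorFlow installed, the same function could be passed the result of device_lib.list_local_devices() directly. A system that only shows XLA_GPU entries has a visible card but may still lack a working regular GPU device.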

xgboost install on tensorflow GPU support

I have already installed TensorFlow with GPU support.
I tried installing xgboost alongside it with
'conda install -c anaconda py-xgboost'
I wonder whether this xgboost has GPU support or not.
I did not follow https://xgboost.readthedocs.io/en/latest/build.html#building-with-gpu-support;
I only have TensorFlow GPU support.
Do I need to install xgboost's GPU support separately if I want to use xgboost with the GPU?
You can check whether your xgboost is compiled for GPU: just try to run a model with tree_method='gpu_hist' or another GPU method.
If it raises an error saying that xgboost is not compiled for GPU, reinstall it following the instructions you found.
You probably don't need to install CUDA (if tensorflow-gpu installed successfully and works, CUDA must already be installed), but you definitely need a GPU-enabled build of xgboost.
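The probe suggested above could look roughly like the following. This is a sketch under assumptions: the function name xgboost_gpu_available is hypothetical, the random data is only there to have something to train on, and it relies on tree_method='gpu_hist' being accepted by the installed xgboost release (newer releases changed how the GPU is selected, so a False result there may only mean the parameter is no longer recognized).

```python
def xgboost_gpu_available():
    """Try one boosting round with tree_method='gpu_hist'.

    Returns True if training ran, False if xgboost raised an error,
    and None if xgboost or numpy is not installed.
    """
    try:
        import numpy as np
        import xgboost as xgb
    except ImportError:
        return None
    # Tiny random dataset, purely for the probe.
    features = np.random.rand(20, 3)
    labels = np.random.randint(0, 2, size=20)
    dtrain = xgb.DMatrix(features, label=labels)
    try:
        xgb.train({"tree_method": "gpu_hist"}, dtrain, num_boost_round=1)
    except xgb.core.XGBoostError:
        return False
    return True


print("xgboost GPU probe:", xgboost_gpu_available())
```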