I've coded a Neural Network from scratch in Python and I am using Google Colaboratory to train it. However, if I enable GPU or TPU acceleration, the training is not faster.
When you search for examples online, all of them use TensorFlow and other libraries, and their training times are shorter with a GPU than without one.
Am I doing it correctly or am I missing something and the GPU is not being used?
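Just enabling the GPU or TPU won't help: if you are not using a framework or library that supports them, you need to explicitly write your code to run on the GPU. Plain Python and NumPy code runs on the CPU no matter which accelerator the runtime has attached.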
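As a rough illustration of what "explicitly coding for the GPU" can look like for a from-scratch network, here is a minimal sketch. It assumes the network is written with NumPy-style array operations and that CuPy (a GPU array library with a NumPy-like interface) may or may not be installed; the helper name get_array_module and the toy forward function are made up for this example and are not from the original post.

```python
import numpy as np


def get_array_module():
    """Return (module, name): CuPy if it imports and sees a CUDA device, else NumPy."""
    try:
        import cupy as cp  # assumption: CuPy is optional and may be missing
        if cp.cuda.runtime.getDeviceCount() < 1:
            raise RuntimeError("no CUDA device visible")
        return cp, "cupy"
    except Exception:
        # Broad catch on purpose: a missing package, missing driver, or
        # missing device all mean "fall back to the CPU".
        return np, "numpy"


xp, backend = get_array_module()


def forward(x, w, b):
    """One dense layer with tanh activation, computed on whichever backend was chosen."""
    return xp.tanh(x @ w + b)


x = xp.ones((4, 3))        # batch of 4 samples, 3 features
w = xp.ones((3, 2)) * 0.5  # weights for 2 output units
b = xp.zeros(2)
out = forward(x, w, b)
print(backend, out.shape)
```

If this prints numpy on a Colab GPU runtime, the arrays are still being processed on the CPU, which would explain why enabling the accelerator made no difference.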
Every time I need to train a 'large' deep learning model I do it from Google Colab, as it allows you to use GPU acceleration.
My PC has a dedicated GPU, and I was wondering whether I can use it to run my notebooks locally and quickly. Is it possible to train models using my PC's GPU? If so, how?
I am open to working with DataSpell, VSCode or any other IDE.
Nicholas Renotte has a great 'Getting Started' video that goes through the entire process of setting up GPU-accelerated notebooks on your PC. The part you're interested in starts around the 12 minute mark.
Yes, it is possible to run .ipynb notebooks locally using GPU acceleration. To do so, you will need to install a GPU-enabled build of a framework such as TensorFlow or PyTorch (Keras runs on top of such a backend). The IDE itself does not need a GPU-specific plugin; it only needs Jupyter notebook support and a Python environment in which the framework is installed.
In terms of IDEs, DataSpell, VSCode, PyCharm, and Jupyter Notebook are all suitable for running notebooks locally with GPU acceleration.
You will also need to install the appropriate drivers for your GPU (for an NVIDIA card, the NVIDIA driver and a CUDA version compatible with your framework) and configure the environment so that the framework can find them.
Finally, check inside the notebook that the framework actually sees the GPU. TensorFlow places operations on a detected GPU by default, while PyTorch requires you to move the model and tensors to the device explicitly. Once all of these steps have been taken, you will be able to run the notebook locally with GPU acceleration.
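A quick way to confirm the local setup from inside a notebook cell is to ask each installed framework whether it can see a GPU. The following is only a sketch: the function name gpu_report is made up for this example, and it assumes that a framework which is not installed should simply be skipped.

```python
def gpu_report():
    """Return {framework_name: gpu_visible} for each framework that can be imported."""
    report = {}
    try:
        import torch
        report["torch"] = bool(torch.cuda.is_available())
    except ImportError:
        pass  # PyTorch not installed in this environment
    try:
        import tensorflow as tf
        report["tensorflow"] = len(tf.config.list_physical_devices("GPU")) > 0
    except ImportError:
        pass  # TensorFlow not installed in this environment
    return report


report = gpu_report()
for name, visible in report.items():
    print(f"{name}: GPU visible = {visible}")
```

A value of False for an installed framework usually points at the driver or CUDA setup rather than at the IDE.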
The question is pretty straightforward but nothing has really been answered.
Pretty simple: how do I know that when I build a Sequential() model in TensorFlow via Keras, it's going to use my GPU?
Normally, in PyTorch, it is easy: I just use the 'device' parameter and can verify via the GPU utilization reported by nvidia-smi. I tried that while building a model in TF, but nvidia-smi shows 0% usage across all GPU devices.
TensorFlow uses the GPU for most operations by default when:
1. It detects at least one GPU.
2. Its GPU support is installed and configured properly. For information on how to install and configure it properly for GPU support: https://www.tensorflow.org/install/gpu
One requirement worth emphasizing is that a specific version of the CUDA library has to be installed; for example, TensorFlow 2.5 requires CUDA 11.2. Check the tested build configurations table in the TensorFlow installation documentation for the CUDA version required by each version of TF.
To know whether it detects GPU devices:
import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
It will also print out debug messages by default to stderr to indicate whether the GPU support is configured properly and whether it detects GPU devices.
To validate using nvidia-smi that it is really using the GPU:
1. You have to define a sufficiently deep and complex neural network model so that the bottleneck is on the GPU side. This can be achieved by increasing the number of layers and the number of channels in each of the layers.
2. During training or inference, for example model.fit() and model.evaluate(), the GPU utilization reported by nvidia-smi should be high.
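To watch utilization from Python while model.fit() runs in another process, one option is to poll nvidia-smi in its CSV query mode. This is a minimal sketch: the helper names parse_gpu_stats and read_gpu_stats are made up for this example, and it assumes every queried field comes back as a plain integer (some GPUs report [N/A] for utilization, which this parser does not handle).

```python
import shutil
import subprocess

# Query index, GPU utilization (%) and used memory (MiB) as bare CSV values.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,utilization.gpu,memory.used",
    "--format=csv,noheader,nounits",
]


def parse_gpu_stats(csv_text):
    """Parse 'index, util, memory' lines into a list of dicts."""
    stats = []
    for line in csv_text.strip().splitlines():
        index, util, mem = (field.strip() for field in line.split(","))
        stats.append(
            {"index": int(index), "util_percent": int(util), "memory_mib": int(mem)}
        )
    return stats


def read_gpu_stats():
    """Return current per-GPU stats, or an empty list if nvidia-smi is not on PATH."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_stats(result.stdout)


# Hypothetical two-GPU output, used only to show the parsed structure.
sample = "0, 87, 10240\n1, 0, 3\n"
parsed = parse_gpu_stats(sample)
print(parsed)
```

Calling read_gpu_stats() every second or so during training should show util_percent rising well above zero on the GPU that TensorFlow is using.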
To know exactly where each operation will be executed, you can add the following line at the beginning of your code:
tf.debugging.set_log_device_placement(True)
For more information: https://www.tensorflow.org/guide/gpu
I'm fairly sure this is a very stupid question, but I can't get it out of my head. I'm sure you know that you can use CUDA or ROCm to accelerate training in TensorFlow/Keras, but I was wondering whether a Raspberry Pi 4 with its GPU could help with training in any way.
I don't know what you mean by "help", but the GPU in the Raspberry Pi 4 is an integrated VideoCore VI. It does not support CUDA, and you cannot attach an external GPU (there is no connection dedicated to one). You could only train on the CPU, but the Raspberry Pi is a resource-limited device, so forget about it. You can run inference on the Raspberry Pi, though.
You should train on a computer and test the model. If it works, save your model weights and structure and deploy them to the Raspberry Pi.
Does anyone with experience using vast.ai for cloud GPU computing know whether, when renting more than one GPU, you need to do some setup to take advantage of the extra GPUs?
I ask because I can't notice any difference in speed when renting 6 or 8 GPUs instead of just one. I'm new to using vast.ai for cloud GPU computing.
I am using this default docker:
Official docker images for deep learning framework TensorFlow (http://www.tensorflow.org)
Successfully loaded tensorflow/tensorflow:nightly-gpu-py3
And just installing keras afterwards:
pip install keras
I have also checked the available GPUs using this and all the GPUs are detected correctly:
from keras import backend as K
K.tensorflow_backend._get_available_gpus()
cheers
Solution:
Finally I found the solution myself. I just used another docker image with an older version of tensorflow (2.0.0), and the error disappeared.
I previously asked whether it is possible to run TensorFlow with GPU support on a CPU. I was told that it is possible and was given the basic code to switch devices, but not how to get the initial code working on a computer that has no GPU at all. For example, I would like to train on a computer that has an NVIDIA GPU but write the code on a laptop that only has a CPU. How would I go about doing this? I have tried just writing the code as normal, but it crashes before I can even switch devices. I am using Python on Linux.
This thread might be helpful: Tensorflow: ImportError: libcusolver.so.8.0: cannot open shared object file: No such file or directory
I tried importing tensorflow with tensorflow-gpu loaded on my university's HPC login node, which does not have GPUs, and it works well. I don't have an NVIDIA GPU in my laptop, so I have never gone through the installation process there. But I think the cause is that it cannot find the relevant CUDA and cuDNN libraries.
But why don't you just use the CPU version? As @Finbarr Timbers mentioned, you can still run the model on a computer with a GPU.
What errors are you getting? It is very possible to train on a GPU but develop on a CPU; many people do it, including myself. In fact, TensorFlow will automatically put your code on a GPU if possible.
If you add the following code to your model, you can see which devices are being used:
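import tensorflow as tf
# Creates a session with log_device_placement set to True (TensorFlow 1.x API;
# in TensorFlow 2.x these classes live under tf.compat.v1).
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))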
This should change when you run your model on a computer with a GPU.
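For the develop-on-CPU, train-on-GPU workflow described above, a related trick is to control which GPUs CUDA-based libraries can see through the CUDA_VISIBLE_DEVICES environment variable. The sketch below is only an illustration: the helper names hide_gpus and expose_gpus are made up, and it assumes the variable is set before TensorFlow (or any other CUDA library) is imported. It will not fix an import that crashes because the CUDA shared libraries are missing; in that case installing the CPU build is the simpler route.

```python
import os


def hide_gpus(env=None):
    """Hide every CUDA device from libraries imported afterwards (CPU-only run)."""
    env = os.environ if env is None else env
    env["CUDA_VISIBLE_DEVICES"] = "-1"
    return env


def expose_gpus(indices, env=None):
    """Make only the given GPU indices visible, e.g. expose_gpus([0, 2])."""
    env = os.environ if env is None else env
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in indices)
    return env


# Demonstrated on plain dicts so the real process environment is left untouched.
cpu_env = hide_gpus({})
gpu_env = expose_gpus([0, 2], {})
print(cpu_env, gpu_env)
```

In a real script you would call hide_gpus() with no argument at the very top, before the import of tensorflow, and then compare the device placement log with and without it.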