Run CNTK on machine with multiple GPUs - cntk

I'm running CNTK on an Azure VM with 2 x K80 GPUs.
Do I need to do anything special in my Python (Jupyter Notebook) code in order to take advantage of the multiple GPUs or will CNTK automatically take advantage of both GPUs?

Currently automatic multi-GPU training in CNTK is supported via MPI. Launching multi-GPU training in Notebook is possible using multiprocessing.
Here is more details on how to run CNTK with multiple GPUs.

Related

Is it possible to run .ipynb notebooks locally using GPU acceleration? How?

Every time I need to train a 'large' deep learning model I do it from Google Collab, as it allows you to use GPU acceleration.
My pc has a dedicated GPU, I was wondering if it is possible to use it to run my notebooks locally in a fast way. Is it possible to train models using my pc GPU? In that case, how?
I am open to work with DataSpell, VSCode or any other IDE.
Nicholas Renotte has a great 'Getting Started' video that goes through the entire process of setting up GPU accelerated notebooks on your PC. The stuff you're interested starts around the 12 minute mark.
Yes, it is possible to run .ipynb notebooks locally using GPU acceleration. To do so, you will need to install the necessary libraries and frameworks such as TensorFlow, PyTorch, or Keras. Depending on the IDE you choose, you will need to install the relevant plugins and packages for GPU acceleration.
In terms of IDEs, DataSpell, VSCode, PyCharm, and Jupyter Notebook are all suitable for running notebooks locally with GPU acceleration.
Once the necessary libraries and frameworks are installed, you will then need to install the appropriate drivers for your GPU and configure the environment for GPU acceleration.
Finally, you will need to modify the .ipynb notebook to enable GPU acceleration and specify the number of GPUs you will be using. Once all the necessary steps have been taken, you will then be able to run the notebook locally with GPU acceleration.

Can Horovod with TensorFlow work on non-GPU instances in Amazon SageMaker?

I want to perform distributed training on Amazon SageMaker. The code is written with TensorFlow and similar to the following code where I think CPU instance should be enough: 
https://github.com/horovod/horovod/blob/master/examples/tensorflow_word2vec.py
Can Horovod with TensorFlow work on non-GPU instances in Amazon SageMaker?
Yeah you should be able to use both CPU's and GPU's with Horovod on Amazon SageMaker. Please follow the below example for the same
https://github.com/aws/amazon-sagemaker-examples/blob/main/sagemaker-python-sdk/tensorflow_script_mode_horovod/tensorflow_script_mode_horovod.ipynb

More than one GPU on vast.ai

Anyone with experience using vast.ai for cloud GPU computing knows if when renting more than one GPU do you need to do some setup to take advantage of the extra GPUs?
Because I can't notice any difference on speed when renting 6 or 8 GPUs instead of just one. I'm new at using vast.ai for cloud GPU computing.
I am using this default docker:
Official docker images for deep learning framework TensorFlow (http://www.tensorflow.org)
Successfully loaded tensorflow/tensorflow:nightly-gpu-py3
And just installing keras afterwards:
pip install keras
I have also checked the available GPUs using this and all the GPUs are detected correctly:
from keras import backend as K
K.tensorflow_backend._get_available_gpus()
cheers
Solution:
Finally I found the solution myself. I just used another docker image with an older version of tensorflow (2.0.0), and the error disappeared.

tf.test.is_gpu_available() returns False on GCP

I am training a CNN on GCP's notebook using a Tesla V100. I've trained a simple yolo on my own custom data and it was pretty fast but not very accurate. So, I decided to write my own code from scratch to solve the specific aspects of the problem that I want to tackle.
I have tried to run my code on Google Colab prior to GCP, and it went well. Tensorflow detects the GPU and is able to use it whether it was a Tesla K80 or T4.
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
tf.test.is_gpu_available() #>>> True
My problem is that, this same function returns a False on GCP notebook, as if Tensorflow is unable to use the GPU it detected on GCP VM. I don't know of any command that forces Tensorflow to use the GPU over CPU, since it does that automatically.
I have already tried to install or uninstall and then install some versions of tensorflow, tensorflow-gpu and tf-nightly-gpu (1.13 and 2.0dev for instance) but it yielded nothing.
output of nvidia-smi
Have you tried using GCP's AI Platform Notebooks instead? They offer VMs that are pre-configured with Tensorflow and have all required GPU drivers installed.

Running Tensorboard without CUDA support

Is it possible to run Tensorboard on a machine without CUDA support?
I'm working at a computation center (via ssh) which has two major clusters:
CPU-Cluster which is a general workhorse without CUDA support (no dedicated GPU)
GPU-Cluster with dedicated GPUs e.g. for running neural networks with tensorflow-gpu.
The access to the GPU-cluster is limited to Training etc. such that I can't afford to run Tensorboard on a machine with CUDA-support. Instead, I'd like to run Tensorboard on the CPU-Cluster.
With the TF bundled Tensorboard I get import errors due to missing CUDA support.
It seems reasonable that the official Tensorboard should have a mode for running with CPU-only. Is this true?
I've also found an inofficial standalone Tensorboard version (github.com/dmlc/tensorboard), does this work without CUDA-support?
Solved my problem: just install tensorflow instead of tensorflow-gpu.
Didn't work for me for a while due to my virtual environment (conda), which didn't properly remove tensorflow-gpu.
Tensorboard is not limited by whether a machine has GPU or not.
And as far as I know, what Tensorboard do is parsing events pb files and display them on web. There is not computing, so it doesn't need GPU.