How to set the number of GPUs in Horovod from code? - horovod

Is it possible to set the number of GPUs from code, using some Horovod method or an environment variable?
Thanks.
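For reference, Horovod does not expose a call that sets a GPU count directly: each worker process drives one GPU, and the count is chosen when you launch the job (e.g. horovodrun -np 4 python train.py). Below is a minimal sketch of the usual per-process GPU pinning from the TF 1.x-era Horovod pattern; treat it as illustrative rather than a complete training script:
import horovod.tensorflow as hvd
import tensorflow as tf

hvd.init()  # one process per GPU; the GPU count comes from the launcher, e.g. horovodrun -np 4
config = tf.ConfigProto()
config.gpu_options.visible_device_list = str(hvd.local_rank())  # pin this process to a single GPU
sess = tf.Session(config=config)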

Related

How to specify which GPU to use when running TensorFlow?

We have a DGX-1 in the lab.
I see many tasks running on different GPUs.
For the MLPerf docker application, I can use NV_GPU=x to assign which GPU to use.
However, I have a Python Keras/TensorFlow script, and when I use the same approach, the load doesn't go to the specified GPU.
You could use CUDA_VISIBLE_DEVICES to specify the GPUs to be used by your model:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"  # assigns GPUs 0 and 1 to the model; must be a string, and set before TensorFlow is imported

How to use Tensor Cores instead of CUDA cores in my code?

I have an RTX 2070 NVIDIA graphics card, which has Tensor Cores. I want to run my deep learning code utilizing the Tensor Cores instead of the CUDA cores. Is that possible on this card, and is there a specific driver I should install in order to do that?
And how should I check whether the model is running on Tensor Cores or CUDA cores?
I am using the Keras framework on Windows 10.
According to NVIDIA:
The container enables Tensor Core math by default
If you want to disable it, you can set TF_DISABLE_CUDNN_TENSOR_OP_MATH to 1.
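In code, that would look like the snippet below. Note this environment variable is documented for NVIDIA's TensorFlow containers and must be set before TensorFlow initializes cuDNN; whether it applies to a plain Windows/Keras install is not confirmed here, so treat this as a sketch:
import os
os.environ["TF_DISABLE_CUDNN_TENSOR_OP_MATH"] = "1"  # disable Tensor Core math; leave unset to keep the container's default (enabled)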

CUDA programming: Is occupancy the way to achieve GPU slicing among different processes?

There are ways through which GPU sharing can be achieved. I came across occupancy. Can I use it to slice the GPU among the processes (e.g. TensorFlow) that share the GPU? By "slice" I mean that GPU resources are always dedicated to that process. Using occupancy I would learn the GPU and SM details, and based on that I would launch kernels that create blocks only for those GPU resources.
I am using an NVIDIA GK210GL [Tesla K80] with the CUDA 9 toolkit installed.
Please suggest. Thanks!
There are ways through which GPU sharing can be achieved.
No, there aren't. In general, there is no such thing as the type of GPU sharing you envisage. There is the MPS server for MPI-style multi-process computing, but that is irrelevant in the context of running TensorFlow (see here for why MPS can't be used).
I came across occupancy. Can I use it to slice the GPU among the processes (e.g. tensorflow) which are sharing GPU?
No, you can't. Occupancy is a performance metric. It has nothing whatsoever to do with the ability to share a GPU's resources among different processes.
Please suggest
Buy a second GPU.

Configuring TensorFlow to use all CPUs

Reading https://www.tensorflow.org/versions/r0.10/resources/faq.html, it states:
Does TensorFlow make use of all the devices (GPUs and CPUs) available on my machine?
TensorFlow supports multiple GPUs and CPUs. See the how-to documentation on using GPUs with TensorFlow for details of how TensorFlow assigns operations to devices, and the CIFAR-10 tutorial for an example model that uses multiple GPUs.
Note that TensorFlow only uses GPU devices with a compute capability greater than 3.5.
Does this mean TensorFlow can automatically make use of all CPUs on a given machine, or does it need to be explicitly configured?
CPUs are used via a "device" which is just a threadpool. You can control the number of threads if you feel you need more:
import tensorflow as tf
NUM_THREADS = 8  # pick a value suited to your machine
sess = tf.Session(config=tf.ConfigProto(
    intra_op_parallelism_threads=NUM_THREADS))
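For completeness, ConfigProto also exposes inter_op_parallelism_threads, which controls how many independent ops may run concurrently, while intra_op splits a single op across threads. A sketch combining both, using the same TF 1.x-era API as above; the thread counts are illustrative only:
import tensorflow as tf
config = tf.ConfigProto(
    intra_op_parallelism_threads=8,  # threads used inside a single op (e.g. a matmul)
    inter_op_parallelism_threads=2)  # how many ops may execute at the same time
sess = tf.Session(config=config)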

How to choose to use the GPU with tf.device() in TensorFlow

I ran the TensorFlow example from models/cifar10/cifar10_train.py. In the example script, they say it will use one GPU, and indeed I found that they place ops on the GPU with tf.device('/gpu:0'):
but they also store some intermediate variables on the CPU, so I wonder how TensorFlow handles the GPU and CPU. Is using the GPU the default choice?
Here is the link for the example:
https://www.tensorflow.org/versions/r0.10/tutorials/deep_cnn/index.html
Any advice will be appreciated.
Thanks in advance.
Good day.
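Since the question is about placement mechanics: when no tf.device() is given, TensorFlow's default placer prefers the GPU for ops that have GPU kernels. A minimal sketch (TF 1.x style, matching the r0.10 tutorial's era) of the pattern cifar10_train.py uses, with variables pinned to the CPU and compute placed on the GPU; the shapes and values are illustrative:
import tensorflow as tf

with tf.device('/cpu:0'):
    w = tf.Variable(tf.ones([4, 4]))  # variable kept in host memory, as the tutorial does

with tf.device('/gpu:0'):
    y = tf.matmul(w, w)  # op explicitly placed on the first GPU

# log_device_placement=True prints the device chosen for every op;
# allow_soft_placement lets TF fall back to the CPU if no GPU exists.
sess = tf.Session(config=tf.ConfigProto(
    log_device_placement=True, allow_soft_placement=True))
sess.run(tf.global_variables_initializer())
print(sess.run(y))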