How to use multiple GPUs for separate training with TensorFlow?

I have looked through many articles and posts about using multiple GPUs with TensorFlow. Most of them cover how to use parallel GPUs to train a single NN (neural network), but I have a different question: can separate GPUs be used to train different NNs at the same time?
More details:
I have neural networks A and B, and GPUs GPU1 and GPU2. I want to train network A on GPU1 and network B on GPU2 at the same time. Is that possible?

I suggest using two separate python scripts to train both networks, such as trainA.py and trainB.py.
In the first two lines of trainA.py you select your preferred GPU.
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
For trainB.py you select the other GPU:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "1"
Now you should be able to run both train scripts at the same time.
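For example, trainA.py could look like the minimal sketch below; the model definition and data are hypothetical stand-ins for your network A, and the important part is that CUDA_VISIBLE_DEVICES is set before TensorFlow is imported:

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"  # must be set before importing TensorFlow

import numpy as np
import tensorflow as tf

# hypothetical stand-in for network A and its training data
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(10,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

x = np.random.rand(256, 10)
y = np.random.rand(256, 1)
model.fit(x, y, epochs=5)

trainB.py would be identical except for CUDA_VISIBLE_DEVICES = "1"; launching both scripts at the same time (for example from two terminals) trains one network per GPU.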

Related

Assign Torch and Tensorflow models two separate GPUs

I am comparing two pre-trained models, one in TensorFlow and one in PyTorch, on a machine that has multiple GPUs. Each model fits on one GPU, and they are both loaded in the same Python script. How can I assign one GPU to the TensorFlow model and another GPU to the PyTorch model?
Setting CUDA_VISIBLE_DEVICES=0,1 only tells both models that these GPUs are available; how can I (within Python, I guess) make sure that TensorFlow takes GPU 0 and PyTorch takes GPU 1?
You can refer to torch.device. https://pytorch.org/docs/stable/tensor_attributes.html?highlight=device#torch.torch.device
In particular, do
device = torch.device("cuda:0")
tensor = tensor.to(device)
or, to load a pretrained model,
device = torch.device("cuda:0")
model = model.to(device)
to put the tensor/model on GPU 0. Note that PyTorch names its GPU devices "cuda:0", "cuda:1", ..., not "gpu:0".
Similarly, TensorFlow has tf.device. https://www.tensorflow.org/api_docs/python/tf/device. Its usage is described here: https://www.tensorflow.org/guide/using_gpu
For TensorFlow to load a model on GPU 0, do
with tf.device("/gpu:0"):
    load_model_function(model_path)
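Putting the two together, a minimal sketch of a single script (run with CUDA_VISIBLE_DEVICES=0,1) that pins a TensorFlow model to GPU 0 and a PyTorch model to GPU 1; the two tiny models are hypothetical stand-ins for the actual pre-trained ones:

import tensorflow as tf
import torch
import torch.nn as nn

# TensorFlow: create this model's variables/ops on the first GPU
with tf.device("/gpu:0"):
    tf_model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])

# PyTorch: move this model to the second GPU
torch_model = nn.Linear(4, 1).to(torch.device("cuda:1"))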

How To Run Two Models In Parallel On Two Different GPUs In Keras

I want to do a grid search for parameters on neural nets. I have two GPUs, and I would like to run one model on the first GPU, and another model with different parameters on the second GPU. A first attempt that doesn't work goes like this:
with tf.device('/gpu:0'):
    model_1 = Sequential()
    model_1.add(embedding)  # the embeddings are defined earlier in the code
    model_1.add(LSTM(50))
    model_1.add(Dense(5, activation='softmax'))
    model_1.compile(loss='categorical_crossentropy', optimizer='adam')
    model_1.fit(np.array(train_x), np.array(train_y), epochs=15, batch_size=15)

with tf.device('/gpu:1'):
    model_2 = Sequential()
    model_2.add(embedding)
    model_2.add(LSTM(100))
    model_2.add(Dense(5, activation='softmax'))
    model_2.compile(loss='categorical_crossentropy', optimizer='adam')
    model_2.fit(np.array(train_x), np.array(train_y), epochs=15, batch_size=15)
Edit: I ran my code again and did not get an error. However, the two models run sequentially rather than in parallel. Is it possible to do multithreading here? That is my next attempt.
There is a lot of discussion online about using multiple GPUs with keras, but when it comes to running multiple models simultaneously, the discussion is limited to running multiple models on a single GPU. The discussion regarding multiple GPUs is also limited to data parallelization and device parallelization. I don't believe I want to do either since I am not trying to break up a single model to run on multiple gpus. Is it possible to run two separate models simultaneously in keras with two GPUs?
A solution to this problem can be found here. However, the softmax activation function runs on the CPU only as of now, so it is necessary to direct the CPU to perform the dense layer:
with tf.device('/cpu:0'):
Switching between the CPU and the GPU does not seem to cause a noticeable slowdown. With LSTMs though, it may be best to run the entire model on the CPU.
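For illustration, a sketch of what that could look like with the Keras functional API; the layer sizes are hypothetical, and this assumes the device scopes are honored when the layers' ops are created:

import tensorflow as tf
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense
from tensorflow.keras.models import Model

inputs = Input(shape=(20,))
with tf.device('/gpu:0'):
    x = Embedding(input_dim=10000, output_dim=64)(inputs)
    x = LSTM(50)(x)
with tf.device('/cpu:0'):
    # route only the dense/softmax layer to the CPU, as suggested above
    outputs = Dense(5, activation='softmax')(x)

model = Model(inputs, outputs)
model.compile(loss='categorical_crossentropy', optimizer='adam')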
You can use multi_gpu_model (reference here)
Define your model first
model = Sequential()
model.add(embedding)  # the embeddings are defined earlier in the code
model.add(LSTM(50))
model.add(Dense(5, activation='softmax'))
and then create a multi_gpu_model with 2 GPUs:
parallel_model = multi_gpu_model(model, gpus=2)
This will work if you want to divide the input and process it on 2 GPUs. It will not cover your use case of having two different models on two GPUs though.
Your code runs the two blocks sequentially because they are executed one after the other in the same thread. You can try threading to run the 2 blocks in parallel; googling "python multithreading" will give you lots of examples, and a rough sketch is shown below.
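A rough sketch of one way to do this, using multiprocessing (one process per GPU) rather than threads so that each model gets its own TensorFlow runtime; the build_and_train helper, layer sizes, and dummy data are hypothetical placeholders for the models in the question:

import multiprocessing as mp
import os

def build_and_train(gpu_id, lstm_units):
    # pin this process to a single GPU before TensorFlow is imported
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    import numpy as np
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Embedding, LSTM, Dense
    from tensorflow.keras.utils import to_categorical

    model = Sequential()
    model.add(Embedding(input_dim=10000, output_dim=64))
    model.add(LSTM(lstm_units))
    model.add(Dense(5, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam')

    # dummy data standing in for train_x / train_y
    x = np.random.randint(0, 10000, size=(150, 20))
    y = to_categorical(np.random.randint(0, 5, size=(150,)), num_classes=5)
    model.fit(x, y, epochs=15, batch_size=15)

if __name__ == "__main__":
    # model with 50 LSTM units on GPU 0, model with 100 units on GPU 1
    jobs = [mp.Process(target=build_and_train, args=(0, 50)),
            mp.Process(target=build_and_train, args=(1, 100))]
    for j in jobs:
        j.start()
    for j in jobs:
        j.join()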

Model parallelism in TensorFlow multi-gpu training

I am training a model on several GPUs on a single machine using TensorFlow. However, I find the speed is much slower than training on a single GPU. I am wondering whether TensorFlow executes the sub-models on different GPUs in parallel or in sequential order. For example:
x = 5
y = 2
with tf.device('/gpu:0'):
    z1 = tf.multiply(x, y)
with tf.device('/gpu:1'):
    z2 = tf.add(x, y)
Does the code inside /gpu:0 and /gpu:1 execute sequentially? If it runs sequentially, how can I make the two parts execute in parallel? Assume the two parts do not depend on each other.
In TensorFlow's graph mode, only the second block (inside gpu:1) would execute if you only evaluate z2, since nothing depends on the first block.
Yes, it is executing sequentially; by its nature, the with block will wait until its computation is complete before moving on to the next code block.
You can implement queues and threading from TensorFlow to leverage your additional compute.
Please refer to this tutorial from TensorFlow:
https://www.tensorflow.org/api_guides/python/threading_and_queues
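As a minimal sketch (TF 1.x graph mode): fetching both tensors in a single session.run() lets TensorFlow schedule the two independent ops, which live on different devices, concurrently rather than one after the other:

import tensorflow as tf

x = tf.constant(5)
y = tf.constant(2)
with tf.device('/gpu:0'):
    z1 = tf.multiply(x, y)
with tf.device('/gpu:1'):
    z2 = tf.add(x, y)

with tf.Session(config=tf.ConfigProto(allow_soft_placement=True)) as sess:
    # both subgraphs are requested at once, so they can run in parallel
    print(sess.run([z1, z2]))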

Merge weights of same model trained on 2 different computers using tensorflow

I was doing some research on training deep neural networks using TensorFlow. I know how to train a model. My problem is that I have to train the same model on 2 different computers with different datasets, then save the model weights. Later I have to merge the 2 model weight files somehow. I have no idea how to merge them. Is there a function that does this, or should the weights be averaged?
Any help on this problem would be useful. Thanks in advance.
There is literally no way to merge weights; you cannot average or combine them in any way, as the result will not mean anything. What you could do instead is combine predictions, but for that the training classes have to be the same.
This is not a programming limitation but a theoretical one.
It is better to merge weight updates (gradients) during the training and keep a common set of weights rather than trying to merge the weights after individual trainings have completed. Both individually trained networks may find a different optimum and e.g. averaging the weights may give a network which performs worse on both datasets.
There are two things you can do:
Look at 'data parallel training': distributing forward and backward passes of the training process over multiple compute nodes each of which has a subset of the entire data.
In this case typically:
each node propagates a minibatch forward through the network
each node propagates the loss gradient backwards through the network
a 'master node' collects gradients from minibatches on all nodes and updates the weights correspondingly
and distributes the weight updates back to the compute nodes to make sure each of them has the same set of weights
(there are variants of the above to avoid compute nodes idling too long while waiting for results from others). The above assumes that the TensorFlow processes running on the compute nodes can communicate with each other during the training.
Look at https://www.tensorflow.org/deploy/distributed for more details and an example of how to train networks over multiple nodes.
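As a toy illustration of the gradient-averaging step described above (the numbers and the two-worker setup are made up; a real setup would use TensorFlow's distributed runtime rather than plain numpy):

import numpy as np

weights = np.zeros(4)          # the common set of weights
learning_rate = 0.1

# gradients computed independently by two workers on their own minibatches
grad_worker_1 = np.array([0.2, -0.1, 0.0, 0.3])
grad_worker_2 = np.array([0.4, 0.1, -0.2, 0.1])

# the "master node" averages the gradients, applies a single update,
# and then sends the updated weights back to both workers
avg_grad = (grad_worker_1 + grad_worker_2) / 2.0
weights -= learning_rate * avg_grad
print(weights)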
If you really have to train the networks separately, look at ensembling, see e.g. this page: https://mlwave.com/kaggle-ensembling-guide/ . In a nutshell, you would train the individual networks on their own machines and then e.g. use an average or maximum over the outputs of both networks as a combined classifier / predictor.
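For example, a minimal ensembling sketch, assuming the two separately trained networks were saved as Keras models; the file names are hypothetical:

import numpy as np
from tensorflow.keras.models import load_model

model_a = load_model("model_machine1.h5")
model_b = load_model("model_machine2.h5")

def ensemble_predict(x):
    # average the class probabilities of the two networks
    return (model_a.predict(x) + model_b.predict(x)) / 2.0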

Train two models on two GPUs

I have two models (theano scripts) I want to train and evaluate.
I have two GPUs I can use to train them.
How can I run a model on each GPU at the same time?
When running your scripts you can choose where each program will run with THEANO_FLAGS:
THEANO_FLAGS='device=gpu0' python script_1.py
THEANO_FLAGS='device=gpu1' python script_2.py
Change gpuX for each GPU (e.g. gpu0, gpu1, gpu2, ...).
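If you prefer to select the device inside each script rather than on the command line, a small sketch of the same idea, assuming Theano picks up THEANO_FLAGS from the environment when it is imported:

# script_1.py
import os
os.environ["THEANO_FLAGS"] = "device=gpu0"
import theano  # picks up the flag set above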