Tensorflow get the default device name - tensorflow

In Tensorflow 1.14 I'm trying to use tf.data.experimental.prefetch_to_device(device=...) to prefetch my data to the GPU. But I'm not always training on a GPU, I often times train on a CPU (especially during development).
Is there a way to get the current default device in use? Tensorflow either picks the CPU (when I set CUDA_VISIBLE_DEVICES=-1) otherwise it'll pick the GPU, the default usually works.
So far I can only find a way to list visible devices with sess.list_devices(), but there must be a way to query the current default device so I don't have to manually change it in prefetch_to_device every time, right?

There is no API way of doing what you said currently. The closest
device = 'gpu:0' if tf.test.is_gpu_available() else 'cpu'
is what you have already said.
The reasons I think so is that the allocation is done at a low level: https://github.com/tensorflow/tensorflow/blob/cf4dbb45ffb4d6ea0dc9c2ecfb514e874092cd16/tensorflow/core/common_runtime/colocation_graph.cc
Maybe you can also try with soft placement
Hope it helps.

The best solution I've found so far is to use tf.test.is_gpu_available:
device = 'gpu:0' if tf.test.is_gpu_available() else 'cpu'

Related

Does tensorflow-quantum support GPU, and if so how do I make it use mine?

I am getting started on using tensorflow-quantum for some QML circuit simulations. I have everything configured correctly for TensorFlow with GPU, and when I run print(tf.config.list_physical_devices('GPU')), it reports the presence of my GPU.
However, I've done some Googling, and I've come across a few things suggesting that tensorflow-quantum doesn't actually support GPU acceleration for simulations (e.g. MichaelBroughton's first reply here, and this issue which is still open). However, it's unclear to me how up-to-date this state of affairs is. I can't find anything about adding GPU support in the version notes.
Does tensorflow-quantum currently support GPU? If so, how do I (a) make it use my GPU for simulations and (b) verify that it is doing so?

I can't use gpu in Colab pro

enter image description here
enter image description here
I am using colab pro. About 4 months ago, I experienced slow learning of the tensorflow model. The learning speed is so slow, and as a result of checking it myself today, I was able to confirm that the gpu was detected normally, but the GPU POWER was off. The volatile GPU Util is also allocated as 0 , but it looks like the GPU is not being utilized for training. When I looked for the cause, there was a saying that the data I/O bottleneck was, so I also modified the DATALOADER, and when I ran the same code and dataset in a different COLAB account, I was able to see that the GPU allocation worked well and the time was also shortened. If there is a problem with the os settings or if there is something I need to fix, please let me know. have a good day
I figured out that the problem was simply a path problem. As we've gotten feedback before, it seems like there's been a bottleneck in loading images through folders.
It was solved by specifying the path of the dataset as content/ .

What does it do if I choose "None" in Hardware Accelerator?

Pretty straightforward question. I was just wondering what was doing the calculus when this option was chosen. Does it run on Goolge's CPU or on my hardware ?
I have looked on Google, Stackoverflow and Colab's Help without success finding a precise answer
Thanks :)
PS : When running a full Dense Network "without" accelarator it is approx. as fast as with TPU and a lot faster than with GPU.
Your guess is correct: None means CPU only, but on a Colab-managed cloud VM rather than your local machine. (Unless you've connected to a local Jupyter instance.
Also keep in mind that you'll need to adjust your code in order to take advantage of hardware accelerators like GPUs and TPUs.
Speedup on a GPU is often a bit magical since many frameworks automatically detect and take advantage of GPUs. Built-in support for TPUs is rare, and obtaining a speedup from TPUs will require adjusting your code.

Google-colaboratory: No backend with GPU available

Here it is described how to use gpu with google-colaboratory:
Simply select "GPU" in the Accelerator drop-down in Notebook Settings (either through the Edit menu or the command palette at cmd/ctrl-shift-P).
However, when I select gpu in Notebook Settings I get a popup saying:
Failed to assign a backend
No backend with GPU available. Would you like to use a runtime with no accelerator?
When I run:
import tensorflow as tf
device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
raise SystemError('GPU device not found')
print('Found GPU at: {}'.format(device_name))
Of course, I get GPU device not found. It seems the description is incomplete. Any ideas what needs to be done?
You need to configure the Notebook with GPU device
Click Edit->notebook settings->hardware accelerator->GPU
You'll need to try again later when a GPU is available. The message indicates that all available GPUs are in use.
The FAQ provides additional info:
How may I use GPUs and why are they sometimes unavailable?
Colaboratory is intended for interactive use. Long-running background
computations, particularly on GPUs, may be stopped. Please do not use
Colaboratory for cryptocurrency mining. Doing so is unsupported and
may result in service unavailability. We encourage users who wish to
run continuous or long-running computations through Colaboratory’s UI
to use a local runtime.
There seems to be a cooldown on continuous training with GPUs. So, if you encounter the error dialog, try again later, and perhaps try to limit long-term training in subsequent sessions.
Add some pictures to make it clearer
My reputation is just slightly too low to comment, but here's a bit of additional info for #Bob Smith's answer re cooldown period.
There seems to be a cooldown on continuous training with GPUs. So, if you encounter the error dialog, try again later, and perhaps try to limit long-term training in subsequent sessions.
Based on my own recent experience, I believe Colab will allocate you at most 12 hours of GPU usage, after which there is roughly an 8 hour cool-down period before you can use compute resources again. In my case, I could not connect to an instance even without a GPU. I'm not entirely sure about this next bit but I think if you run say 3 instances at once, your 12 hours are depleted 3 times as fast. I don't know after what period of time the 12 hour limit resets, but I'd guess maybe a day.
Anyway, still missing a few details but the main takeaway is that if you exceed you'll limit, you'll be locked out from connecting to an instance for 8 hours (which is a great pain if you're actively working on something).
After Reset runtime didn't work, I did:
Runtime -> Reset all runtimes -> Yes
I then got a happy:
Found GPU at: /device:GPU:0
This is the precise answer to your question man.
According to a post from Colab :
overall usage limits, as well as idle timeout periods, maximum VM
lifetime, GPU types available, and other factors, vary over time.
GPUs and TPUs are sometimes prioritized for users who use Colab
interactively rather than for long-running computations, or for users
who have recently used less resources in Colab. As a result, users who
use Colab for long-running computations, or users who have recently
used more resources in Colab, are more likely to run into usage limits
and have their access to GPUs and TPUs temporarily restricted. Users
with high computational needs may be interested in using Colab’s UI
with a local runtime running on their own hardware.
Google Colab has by default tensorflow 2.0, Change it to tensorflow 1. Add the code,
%tensorflow_version 1.x
Use it before any keras or tensorflow code.

TensorFlow operations with GPU support

Is there a way (or maybe a list?) to determine all the tensorflow operations which offer GPU support?
Right now, it is a trial and error process for me - I try to place a group of operations on GPU. If it works, sweet. If not, try somewhere else.
This is the only thing relevant (but not helpful) I have found so far: https://github.com/tensorflow/tensorflow/issues/2502
Unfortunately, it seems there's no built-in way to get an exact list or even check this for a specific op. As mentioned in the issue above, the dispatching is done in the native C++ code: a particular operation can be assigned to GPU if a corresponding kernel has been registered to DEVICE_GPU.
I think the easiest way for you is to grep "REGISTER_KERNEL_BUILDER" -r tensorflow the tensorflow source base to get a list of matched operations, which will look something like this.
But remember that even with REGISTER_KERNEL_BUILDER specification, there's no guarantee an op will be performed on a GPU. For example, 32-bit int Add is assigned on CPU regardless of the existing kernel.