How to choose GPU when running phoronix-test-suite benchmark?

I am new to Phoronix Test Suite and ran my first test with phoronix-test-suite benchmark testname. This ran the test for one of my GPUs but not the other. How can I choose which GPU to use for the benchmark?
I've searched Google and skimmed the documentation for an answer but found nothing.
EDIT
The test I am trying to run is here, using
phoronix-test-suite benchmark 2102179-HA-NVIDIAGEF76
I've also tried using the method described here but to no avail.
I am using Phoronix Test Suite v10.2.2 (Harstad) on Ubuntu 20.04.2 LTS.
UPDATE
According to this issue, phoronix-test-suite always chooses the default GPU on a given system.
PTS currently sticks to using the default GPU configured by your system whether it be configured via PRIME handling or other multi-GPU setup configurations. Basically, it doesn't override your default GPU choice(s) or interfere beyond simply reporting the enumerated GPUs.
So the official way to change the GPU used by a Phoronix benchmark is to change the 'default GPU' on the broader system. I don't understand what determines which GPU is the default or how to change it. The quote above indicates that the default GPU might be changed using PRIME.
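If PRIME is what picks the default on an Ubuntu box, then checking and switching the PRIME profile should be the lever. Here is a quick sketch of the check I would expect to work, assuming the nvidia-prime package (which provides prime-select) is installed:

import shutil
import subprocess

# Assumes Ubuntu's nvidia-prime package; 'prime-select query' reports which
# GPU profile is currently the default (e.g. 'nvidia' or 'intel').
if shutil.which("prime-select"):
    result = subprocess.run(["prime-select", "query"],
                            capture_output=True, text=True)
    print("Current PRIME profile:", result.stdout.strip())
    # Switching the default would then be e.g. 'sudo prime-select nvidia'.
else:
    print("prime-select not found; PRIME does not seem to manage the GPUs here")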
When running nvidia-settings, the following messages are printed.
** (nvidia-settings:9809): WARNING **: 15:46:41.950: PRIME: Failed to execute child process “/usr/bin/prime-supported” (No such file or directory)
** Message: 15:46:41.950: PRIME: is it supported? no
So it seems that whatever PRIME is, it's not part of my system.

As you were looking to configure an Nvidia GPU, the logic is slightly different:
Looking at the source, PTS seems to always use the first GPU it finds in the output of nvidia-settings --query PCIID.
This has been further confirmed by the PTS lead developer on GitHub, so unfortunately there is no switch in PTS that would let you pick the GPU.
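If you want to see which GPU that would be on your machine, you can replicate the same query yourself. A minimal sketch, assuming nvidia-settings is on your PATH and an X session is available (the parsing is only illustrative):

import subprocess

# PTS reportedly takes the first GPU enumerated by this query, so the first
# PCIID line printed here corresponds to the GPU it will use.
output = subprocess.run(["nvidia-settings", "--query", "PCIID"],
                        capture_output=True, text=True).stdout
for line in output.splitlines():
    if "PCIID" in line:
        print(line.strip())
        break  # stop at the first match, i.e. the GPU PTS would pick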

On Windows, if you are using an Nvidia GPU, you can do this from the Nvidia Control Panel:
Go to Manage 3D Settings.
Go to "Program Settings".
Select your app (in this case the Phoronix Test Suite benchmark) and select the high-performance Nvidia GPU.
Now run the benchmark test.
For more help, see: https://www.phoronix-test-suite.com/documentation/phoronix-test-suite.pdf

Related

Does tensorflow-quantum support GPU, and if so how do I make it use mine?

I am getting started on using tensorflow-quantum for some QML circuit simulations. I have everything configured correctly for TensorFlow with GPU, and when I run print(tf.config.list_physical_devices('GPU')), it reports the presence of my GPU.
However, I've done some Googling, and I've come across a few things suggesting that tensorflow-quantum doesn't actually support GPU acceleration for simulations (e.g. MichaelBroughton's first reply here, and this issue which is still open). However, it's unclear to me how up-to-date this state of affairs is. I can't find anything about adding GPU support in the version notes.
Does tensorflow-quantum currently support GPU? If so, how do I (a) make it use my GPU for simulations and (b) verify that it is doing so?
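In case it helps, the only verification I know how to do on the plain TensorFlow side is device placement logging; this is just a sketch and says nothing about whether tensorflow-quantum's own simulator kernels run on the GPU:

import tensorflow as tf

# Log the device every op is placed on; GPU-placed ops show up as .../GPU:0.
tf.debugging.set_log_device_placement(True)

print(tf.config.list_physical_devices('GPU'))

# A trivial op just to see the placement log in action.
a = tf.random.uniform((1000, 1000))
b = tf.matmul(a, a)
print(b.device)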

How to define multiple gres resources in SLURM using the same GPU device?

I'm running machine learning (ML) jobs that make use of very little GPU memory.
Thus, I could run multiple ML jobs on a single GPU.
To achieve that, I would like to add multiple lines in the gres.conf file that specify the same device.
However, it seems the Slurm daemon doesn't accept this; the service returns:
fatal: Gres GPU plugin failed to load configuration
Is there any option I'm missing to make this work?
Or maybe a different way to achieve that with SLURM?
It is kind of similar to this one, but that one seems specific to some CUDA code with compilation enabled, which is much more specific than my general case (or at least as far as I understand it).
How to run multiple jobs on a GPU grid with CUDA using SLURM
I don't think you can oversubscribe GPUs, so I see two options:
You can configure the CUDA Multi-Process Service or
pack multiple calculations into a single job that has one GPU and run them in parallel.
Besides Nvidia MPS mentioned by @Marcus Boden, which is relevant for V100-type cards, there is also Multi-Instance GPU (MIG), which is relevant for A100-type cards.
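If you go the packing route, a minimal sketch of what the single-GPU job could run is below; train.py and its arguments are placeholders for whatever your actual ML workload is:

import subprocess

# Inside one SLURM job that owns one GPU, start several small ML runs at once.
# 'train.py' and '--run-id' are placeholders for your real workload.
procs = [subprocess.Popen(["python", "train.py", "--run-id", str(i)])
         for i in range(4)]
for p in procs:
    p.wait()  # the job ends only after every packed run has finished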

USRP N210 overflows in virtual machine using GnuRadio

I am using the USRP N210 through a Debian (4.19.0-6-amd64 #1 SMP Debian 4.19.67-2+deb10u1) VM and very quickly run into processing overflows. GNU Radio Companion prints the letter "D" the moment one of the CPUs reaches 100% load. This was tested by increasing the number of taps for a low-pass filter, as shown in the picture, with a sampling rate of 6.25 MHz.
I have followed all the instructions in How to tune a USRP, except for the CPU governor. I am not able to set this due to a missing driver reported by cpufreq-info. The exact output is
No or unknown cpufreq driver is active on this CPU.
The output of the lscpu command is also shown in a picture.
Does anyone have an idea how I can resolve the problem? Or is GNU Radio just not fully supported in VMs?
Dropping packets when your CPU can't keep up is expected; it's the direct effect of that.
The problem is most likely not within your VM, but with the virtualizer.
Virtualization adds some overhead, and whilst modern virtualizers have gotten pretty good at it, you're asking an application with hard real-time requirements to run under high network load. This might take away CPU cycles on the host side that your VM doesn't even know about – your 100% is less than it looks!
So, first of all, make sure your virtualizer does as little as possible with the network traffic. In particular, no NAT; best-case, hardware bridging.
Then, the freq-xlating FIR definitely isn't the highest-performing block. Try using a rotator instead, followed by an FFT FIR. In your case, let that FIR decimate by a factor of 2 – you've done enough low-pass filtering to reduce the sampling rate without getting aliases.
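A rough sketch of that replacement chain, where the sample rate, offset, cutoff and transition width are placeholders and the null source/sink stand in for your UHD source and downstream blocks:

import math
from gnuradio import gr, blocks, filter
from gnuradio.filter import firdes

class rx_chain(gr.top_block):
    def __init__(self, samp_rate=6.25e6, offset=200e3):
        gr.top_block.__init__(self, "rotator plus FFT FIR")
        # The rotator shifts the signal of interest to baseband; it is much
        # cheaper than a frequency-xlating FIR.
        rot = blocks.rotator_cc(-2.0 * math.pi * offset / samp_rate)
        # FFT-based FIR low-pass, decimating by 2 as suggested above.
        taps = firdes.low_pass(1.0, samp_rate, samp_rate / 8, samp_rate / 16)
        lpf = filter.fft_filter_ccf(2, taps)
        src = blocks.null_source(gr.sizeof_gr_complex)  # stand-in for the UHD source
        snk = blocks.null_sink(gr.sizeof_gr_complex)
        self.connect(src, rot, lpf, snk)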
Lastly, it might be a good idea to use a newer version of GNU Radio. In Debian testing, apt will get you a GNU Radio from the 3.8 release series.

Would a Vulkan program run on a device without gpu (discrete or integrated)?

Perhaps this question could be rephrased as 'what would happen if I were to try and run a Vulkan program on a cpu-only build'.
I'm wondering whether the program would run but not produce output, crash or not build in the first place (although I expect the building process to be for a cpu architecture instead of a gpu architecture).
Would it use the on-motherboard graphics to produce output? In that case, what would happen if the program was run on a cpu-only server?
It depends on how the program initializes Vulkan.
Any build can have the Vulkan loader installed; this is the dynamically loaded library that finds the actual driver. If it is missing, the program will be unable to load the loader and may either fail to start or show an error message, depending on how it tries to load it.
If no device is available, then the number of enumerated devices is 0. Again, it is up to the application to handle this, either by falling back to an alternative graphics API (OpenGL) or by showing an error message and refusing to start.
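To illustrate just the first failure mode (the loader not being present at all), here is a small sketch rather than real Vulkan initialization code:

import ctypes.util

# The Vulkan loader is an ordinary shared library (libvulkan.so.1 on Linux).
# On a machine without it, this lookup fails, and an application that links
# the loader at startup fails to launch in much the same way.
loader = ctypes.util.find_library("vulkan")
if loader is None:
    print("No Vulkan loader found - a Vulkan program could not even start")
else:
    print("Loader present:", loader, "- device enumeration may still report 0 devices")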

Google-colaboratory: No backend with GPU available

Here it is described how to use a GPU with Google Colaboratory:
Simply select "GPU" in the Accelerator drop-down in Notebook Settings (either through the Edit menu or the command palette at cmd/ctrl-shift-P).
However, when I select GPU in Notebook Settings, I get a popup saying:
Failed to assign a backend
No backend with GPU available. Would you like to use a runtime with no accelerator?
When I run:
import tensorflow as tf
device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
    raise SystemError('GPU device not found')
print('Found GPU at: {}'.format(device_name))
Of course, I get GPU device not found. It seems the description is incomplete. Any ideas what needs to be done?
You need to configure the notebook with a GPU device:
Click Edit -> Notebook settings -> Hardware accelerator -> GPU
You'll need to try again later when a GPU is available. The message indicates that all available GPUs are in use.
The FAQ provides additional info:
How may I use GPUs and why are they sometimes unavailable?
Colaboratory is intended for interactive use. Long-running background
computations, particularly on GPUs, may be stopped. Please do not use
Colaboratory for cryptocurrency mining. Doing so is unsupported and
may result in service unavailability. We encourage users who wish to
run continuous or long-running computations through Colaboratory’s UI
to use a local runtime.
There seems to be a cooldown on continuous training with GPUs. So, if you encounter the error dialog, try again later, and perhaps try to limit long-term training in subsequent sessions.
My reputation is just slightly too low to comment, but here's a bit of additional info for @Bob Smith's answer regarding the cooldown period.
There seems to be a cooldown on continuous training with GPUs. So, if you encounter the error dialog, try again later, and perhaps try to limit long-term training in subsequent sessions.
Based on my own recent experience, I believe Colab will allocate you at most 12 hours of GPU usage, after which there is roughly an 8-hour cooldown period before you can use compute resources again. In my case, I could not connect to an instance even without a GPU. I'm not entirely sure about this next bit, but I think if you run, say, 3 instances at once, your 12 hours are depleted 3 times as fast. I don't know after what period of time the 12-hour limit resets, but I'd guess maybe a day.
Anyway, some details are still missing, but the main takeaway is that if you exceed your limit, you'll be locked out from connecting to an instance for 8 hours (which is a great pain if you're actively working on something).
After Reset runtime didn't work, I did:
Runtime -> Reset all runtimes -> Yes
I then got a happy:
Found GPU at: /device:GPU:0
This is the precise answer to your question man.
According to a post from Colab :
overall usage limits, as well as idle timeout periods, maximum VM
lifetime, GPU types available, and other factors, vary over time.
GPUs and TPUs are sometimes prioritized for users who use Colab
interactively rather than for long-running computations, or for users
who have recently used less resources in Colab. As a result, users who
use Colab for long-running computations, or users who have recently
used more resources in Colab, are more likely to run into usage limits
and have their access to GPUs and TPUs temporarily restricted. Users
with high computational needs may be interested in using Colab’s UI
with a local runtime running on their own hardware.
Google Colab has TensorFlow 2.0 by default. Change it to TensorFlow 1 by adding the code
%tensorflow_version 1.x
Use it before any Keras or TensorFlow code.
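Put together, the cell could look like this; the magic only works in a Colab notebook and must run before TensorFlow is imported:

%tensorflow_version 1.x

import tensorflow as tf
print(tf.__version__)             # should report a 1.x release
print(tf.test.gpu_device_name())  # '/device:GPU:0' when a GPU backend is attached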