I need a system for OpenCL programming with the following restrictions:
The discrete GPU must not run as a display card --> I can do that from the BIOS
The internal GPU of the AMD APU must be used as the display GPU --> I can do that from the BIOS
OpenCL must not recognize the internal APU's GPU and must always default to the discrete GPU
Why do I need this?
It is because I am working on GPU code that requires the GPU's BIOS to be flashed and a custom BIOS to be installed, which makes the GPU unusable for display.
AMD boards can't boot without a VGA device, so I am getting an APU with an integrated GPU.
The code base I am working on can't deal with conflicting GPUs, so I need to hide the APU's GPU so that OpenCL never sees it.
How can I approach it?
According to the AMD OpenCL Programming Guide, AMD's drivers support the GPU_DEVICE_ORDINAL environment variable to configure which devices are used (Section 2.3.3):
In some cases, the user might want to mask the visibility of the GPUs seen by the OpenCL application. One example is to dedicate one GPU for regular graphics operations and the other three (in a four-GPU system) for Compute. To do that, set the GPU_DEVICE_ORDINAL environment parameter, which is a comma-separated list variable:
Under Windows: set GPU_DEVICE_ORDINAL=1,2,3
Under Linux: export GPU_DEVICE_ORDINAL=1,2,3
You'll first need to determine the ordinals of the devices you want to include. For this, I would recommend running clinfo with the -l switch, which prints a basic tree of the available OpenCL platforms and devices. If the devices are listed with the APU first and the dedicated GPU second, you would want to enable only device 1 (the discrete GPU) and set the environment variable to GPU_DEVICE_ORDINAL=1.
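To confirm the masking took effect, you can enumerate what the runtime actually exposes. Here is a minimal sketch using pyopencl (my choice of tool, not something the guide prescribes; the ordinal value 1 is assumed from the clinfo -l listing described above):

import os

# Hide the APU's GPU from the AMD OpenCL runtime before it initializes.
# "1" is the assumed ordinal of the discrete GPU as reported by `clinfo -l`.
os.environ["GPU_DEVICE_ORDINAL"] = "1"

import pyopencl as cl

# Only the discrete GPU should now appear under the AMD platform.
for platform in cl.get_platforms():
    for device in platform.get_devices():
        print(f"{platform.name}: {device.name}")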
I want to detect the AMD GPU generation in Python code. My use case: to run a specific application (DaVinci Resolve), the amdgpu-pro drivers are required for GPUs older than Vega, and they are not required when the AMD GPU is of the Vega generation or newer. See the list of AMD GPUs on Wikipedia. I am writing a script (davinci-resolve-checker) that tells the user whether they need that driver.
The question is: how do I differentiate between GPU generations / chip codenames? I am using pylspci to get information about the GPUs present in the system. Is there a list of generations I can check against?
There is a pci id list published here: https://pci-ids.ucw.cz/v2.2/pci.ids
In that list, under Advanced Micro Devices, Inc. [AMD/ATI] (vendor id 1002), you can see their devices. For example, for the AMD Radeon PRO W6600 GPU there is the line 73e3 Navi 23 WKS-XL [Radeon PRO W6600].
You can then check whether the device name contains the codename substring; in this case it is Navi.
For this question, the codenames that currently describe Vega and newer are Vega and Navi. So if the device name does not contain either substring, I consider the GPU "older than Vega".
For scripting this, you do not actually need the list file, as you can just take the device name from pylspci's VerboseParser via device.device.name. Just in case, the list is also located at /usr/share/hwdata/pci.ids on the system.
This is probably not a very reliable approach, but I have not yet found a better one. Any suggestions are welcome.
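A minimal sketch of that check (assuming pylspci is installed; the codename tuple, the class-name filter, and the vendor-string check are my assumptions and may need updating for future generations):

from pylspci.parsers import VerboseParser

# Codename substrings treated as "Vega or newer"; extend as new families appear.
VEGA_OR_NEWER = ("Vega", "Navi")

for device in VerboseParser().run():
    # Keep only display-class devices from the AMD/ATI GPU vendor id.
    if (device.cls.name or "") not in ("VGA compatible controller", "Display controller", "3D controller"):
        continue
    if "[AMD/ATI]" not in (device.vendor.name or ""):
        continue
    name = device.device.name or ""
    needs_pro_driver = not any(codename in name for codename in VEGA_OR_NEWER)
    print(f"{name}: needs amdgpu-pro = {needs_pro_driver}")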
I'm currently using logical device configuration in TensorFlow 2.3.0 to simulate multi-GPU training, and it is working. If I buy another GPU, will I be able to use the same functionality with each GPU?
Right now I have 4 virtual GPUs and one physical GPU. I want to buy another GPU and have 2x4 virtual GPUs. I haven't found any information about this, and because I don't have another GPU right now, I can't test it. Is it supported? I'm afraid it's not.
Yes, you can add another GPU. There is no restriction on the number of GPUs; you can make use of all the GPU devices you have.
As the documentation says:
A visible tf.config.PhysicalDevice will by default have a single tf.config.LogicalDevice associated with it once the runtime is initialized. Specifying a list of tf.config.LogicalDeviceConfiguration objects allows multiple devices to be created on the same tf.config.PhysicalDevice.
You can follow this documentation for more details on using multiple GPUs.
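For illustration, a sketch of splitting each physical GPU into 4 logical GPUs (assuming two physical GPUs are visible; the memory_limit values are purely illustrative, and on TF 2.3 the equivalent tf.config.experimental APIs may be needed instead):

import tensorflow as tf

physical_gpus = tf.config.list_physical_devices("GPU")

# Split every physical GPU into 4 logical GPUs (2 physical -> 8 logical).
for gpu in physical_gpus:
    tf.config.set_logical_device_configuration(
        gpu,
        [tf.config.LogicalDeviceConfiguration(memory_limit=1024) for _ in range(4)],
    )

logical_gpus = tf.config.list_logical_devices("GPU")
print(f"{len(physical_gpus)} physical GPUs -> {len(logical_gpus)} logical GPUs")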
I have an NVIDIA RTX 2070 GPU and CUDA installed, I have WebGL support, but when I run the various TFJS examples, such as the Addition RNN Example or the Visualizing Training Example, I see my CPU usage go to 100% but the GPU (as metered via nvidia-smi) never gets used.
How can I troubleshoot this? I don't see any console messages about not finding the GPU. The TFJS docs are really vague about this, only saying that it uses the GPU if WebGL is supported and otherwise falls back to the CPU if it can't find WebGL. But again, WebGL is working. So... how do I help it find my GPU?
Other related SO questions seem to be about tfjs-node-gpu, e.g., getting one's own tfjs-node-gpu installation working. This is not about that.
I'm talking about running the main TFJS examples on the official TFJS pages from my browser.
Browser is the latest Chrome for Linux. Running Ubuntu 18.04.
EDIT: Since someone will ask, chrome://gpu shows that hardware acceleration is enabled. The output log is rather long, but here's the top:
Graphics Feature Status
Canvas: Hardware accelerated
Flash: Hardware accelerated
Flash Stage3D: Hardware accelerated
Flash Stage3D Baseline profile: Hardware accelerated
Compositing: Hardware accelerated
Multiple Raster Threads: Enabled
Out-of-process Rasterization: Disabled
OpenGL: Enabled
Hardware Protected Video Decode: Unavailable
Rasterization: Software only. Hardware acceleration disabled
Skia Renderer: Enabled
Video Decode: Unavailable
Vulkan: Disabled
WebGL: Hardware accelerated
WebGL2: Hardware accelerated
Got it essentially solved. I found an older post explaining that one needs to check whether WebGL is using the "real" GPU or just the Intel integrated graphics built into the CPU.
To do this, go to https://alteredqualia.com/tmp/webgl-maxparams-test/, scroll down to the very bottom, and look at the Unmasked Renderer and Unmasked Vendor tags.
In my case, these were showing Intel, not my NVIDIA GPU.
My System76 laptop can run in "Hybrid Graphics" mode, in which big computations are performed on the NVIDIA GPU while smaller things like GUI elements run on the integrated graphics (this saves battery life). Some applications are able to take advantage of the GPU in Hybrid Graphics mode (I just ran a great Adversarial Latent AutoEncoder demo that maxed out my GPU in that mode), but not all are, and Chrome is apparently one of the latter.
To get WebGL to see my NVIDIA GPU, I needed to reboot my system in "full NVIDIA Graphics" mode.
After this reboot, some of the TFJS examples use the GPU, such as the Visualizing Training example, which now trains almost instantly instead of taking a few minutes. But the Addition RNN example still only uses the CPU. This may be because of the missing backend declaration that @edkeveked pointed out.
I am using an AMD Radeon Pro Duo for my OpenCL application.
It has dual Fiji GPUs. How can I configure CrossFire to make them work as one device? I am using clGetDeviceInfo in OpenCL to check the device compute units, but it shows 64 for each Fiji GPU.
I have 128 compute units in total across the two GPUs. How can I use all of them via CrossFire?
OpenCL has device fission but not device fusion. Devices can share memory for efficiency but shaders can't be joined.
There are also some functions that can't synchronize between two GPUs yet:
Atomic functions in kernels
Prefetch commands (which GPU's global cache?)
clEnqueueAcquireGLObjects (which GPU's buffer?)
clCreateBuffer (which device's memory does it choose? We can't choose.)
clEnqueueTask (where does this task go?)
You should partition the encoding work into two pieces and run one on each GPU. This may even require CrossFire to be disabled if the drivers have problems with it. This shouldn't be harder than writing a GPGPU encoder.
You may need to copy the data to only one of the devices, then copy half of it to the other GPU from that buffer, instead of passing it over PCIe twice; the inter-GPU connection should be faster than PCIe.
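A rough sketch of that split, written with pyopencl for brevity (my assumption; the kernel and sizes are illustrative only, and the same structure applies in C):

import numpy as np
import pyopencl as cl

platform = cl.get_platforms()[0]
gpus = platform.get_devices(device_type=cl.device_type.GPU)  # expect the two Fiji devices
ctx = cl.Context(gpus)
queues = [cl.CommandQueue(ctx, device=d) for d in gpus]

src = "__kernel void scale(__global float *x) { x[get_global_id(0)] *= 2.0f; }"
prog = cl.Program(ctx, src).build()

data = np.arange(1024, dtype=np.float32)
half = data.size // 2
mf = cl.mem_flags
bufs = [cl.Buffer(ctx, mf.READ_WRITE | mf.COPY_HOST_PTR,
                  hostbuf=data[i * half:(i + 1) * half]) for i in range(2)]

# Launch one half of the work on each GPU, then read both halves back.
for q, b in zip(queues, bufs):
    prog.scale(q, (half,), None, b)
for i, (q, b) in enumerate(zip(queues, bufs)):
    cl.enqueue_copy(q, data[i * half:(i + 1) * half], b)
for q in queues:
    q.finish()

Each GPU gets its own command queue and its own half of the data; whether you share one context across both devices or create one per device is mostly a matter of convenience.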
I have a system with a discrete GPU, AMD Radeon HD7850, for computations only. The GPU has no monitor connected to it.
I would like to read fan speed and temperature from the GPU. This can normally be done with the ADL (AMD Display Library) API.
E.g. ADL_Overdrive6_FanSpeed_Get and ADL_Overdrive6_Temperature_Get. However, all ADL API calls return an error when no displays are active, i.e. when no monitor is connected.
How do I read these values when the GPU has no monitor connected to it?
The AMD Catalyst Control Center has the same problem; it too can't read the values when the display is inactive.
I know the values are accessible, because I can see them in HWiNFO64.
After consulting AMD and the developers of HWiNFO64, I have learned that the only way to get these values from a headless GPU is to read them directly from the GPU registers.
To do this, you need to write your own driver, since AMD doesn't make an API available for it.