Architecture of Nvidia driver open source code


I was looking at the half-open, half-closed source code of Nvidia's Linux device driver. I am trying to understand how it manages user-space memory allocation on the GPU side for CUDA programs. I think I have found the module responsible for it: uvm8_pmm_gpu.c, which interacts with PMA, which is closed source (PMA is the physical memory allocator and PMM is the physical memory manager). It seems to have a sort of buddy allocator, but I can't make the driver execute this code no matter what CUDA code I run. So my questions are:
1) Has someone looked at the code of Nvidia's GPU drivers? Is there a wiki somewhere where someone has posted about its architecture?
2) Does anyone know how the driver manages memory allocation?
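
For readers unfamiliar with the term, here is a minimal sketch of what a buddy allocator does in general. It is a toy illustration only and is not taken from, or claimed to match, the code in uvm8_pmm_gpu.c; the class name `BuddyAllocator` and its parameters are hypothetical.

```python
class BuddyAllocator:
    """Toy buddy allocator over a power-of-two sized address range.

    Illustration of the general technique only; NOT Nvidia's implementation.
    """

    def __init__(self, total_size, min_block):
        # Both sizes must be powers of two, with min_block <= total_size.
        assert total_size > 0 and total_size & (total_size - 1) == 0
        assert min_block > 0 and min_block & (min_block - 1) == 0
        assert min_block <= total_size
        self.total_size = total_size
        self.min_block = min_block
        # free_lists maps a block size to the set of free block offsets.
        self.free_lists = {}
        size = min_block
        while size <= total_size:
            self.free_lists[size] = set()
            size *= 2
        self.free_lists[total_size].add(0)
        self.allocated = {}  # offset -> size of the block handed out

    def _round_up(self, size):
        # Round a request up to the next power-of-two block size.
        block = self.min_block
        while block < size:
            block *= 2
        return block

    def alloc(self, size):
        """Return the offset of a block of at least `size`, or None if full."""
        want = self._round_up(size)
        # Find the smallest non-empty free list that is large enough.
        block = want
        while block <= self.total_size and not self.free_lists[block]:
            block *= 2
        if block > self.total_size:
            return None
        offset = self.free_lists[block].pop()
        # Split in half repeatedly; the upper half (the "buddy") stays free.
        while block > want:
            block //= 2
            self.free_lists[block].add(offset + block)
        self.allocated[offset] = want
        return offset

    def free(self, offset):
        """Release a block, merging with its buddy while the buddy is free."""
        block = self.allocated.pop(offset)
        while block < self.total_size:
            buddy = offset ^ block  # buddy offset differs in exactly one bit
            if buddy not in self.free_lists[block]:
                break
            self.free_lists[block].remove(buddy)
            offset = min(offset, buddy)
            block *= 2
        self.free_lists[block].add(offset)


if __name__ == "__main__":
    heap = BuddyAllocator(total_size=1024, min_block=64)
    a = heap.alloc(100)  # rounded up to a 128-unit block
    b = heap.alloc(64)
    print("allocated offsets:", a, b)
    heap.free(a)
    heap.free(b)
    print("free blocks by size:", heap.free_lists)
```

The point of the structure is that splitting and merging only ever involve a block and its single "buddy", so freeing memory can coalesce neighbours with a cheap XOR on the offset.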

Related

How can I "attach" an Nvidia GPU card to an XP VM in VMware?

I have an old Nvidia GPU card that basically goes unused on my host, so I wanted to attach it to a VMware VM (Win XP) and use it the same way I can use a USB stick. How difficult is it to write a device driver to hook VMware's API up to Nvidia's API? What open source device drivers are there to do this? It's not for graphics processing but for splitting parallel computations up to run faster, hence the Mesa project is not the right direction...

Use GPU and computer resources in parallel

If my GPU runs out of memory while running TensorFlow, is there any way to use the computer's memory in parallel? For example, is there any configuration option for this when compiling TensorFlow?
I read the "Allowing GPU memory growth" section in this link but didn't see anything about sharing resources.
Thanks in advance

Vulkan SDK setup: vkEnumerateInstanceExtensionProperties failed to find the VK_KHR_surface extension

I tried to run the Vulkan cube example after downloading the Vulkan SDK but get the following
vkEnumerateInstanceExtensionProperties failed to find the VK_KHR_surface extension.
Do you have a compatible Vulkan installable client driver (ICD) installed?
I have a Nvidia GK107M [Geforce GT 755M] graphics card.
Regarding the graphics driver, the output of
lshw -c video | grep 'configuration' is
configuration: driver=nvidia latency=0
configuration: driver=i915 latency=0
And when I look in the driver manager, it shows the Nvidia-352 graphics driver. Earlier I was using the Nouveau display driver, which I disabled thinking that it might not support Vulkan and that the Nvidia driver would. But the same problem persists.
On running ./vulkaninfo I got a message saying that Vulkan instance creation failed with VK_ERROR_INCOMPATIBLE_DRIVER.
P.S.: I am using the latest Vulkan SDK, released just today. I am going to try the older SDK versions. Maybe they will work.
P.P.S.: I have run into a black/blank screen issue after updating the Nvidia driver to 370 and rebooting.

Optimus. Well, there you have it. To quote directly from the driver package documents:
Some designs incorporating supported GPUs may not be compatible with the NVIDIA Linux driver: in particular, notebook and all-in-one desktop designs with switchable (hybrid) or Optimus graphics will not work if means to disable the integrated graphics in hardware are not available. Hardware designs will vary from manufacturer to manufacturer, so please consult with a system's manufacturer to determine whether that particular system is compatible.
So, you need to disable it (in the BIOS) if possible, as it says above.
Or get an updated driver from the notebook manufacturer (well, there is about as much chance of that as of seeing an Android update on a no-name Chinese tablet, if they even bother with Linux support).
Or expect exactly the kind of problems and hackery, with no guaranteed success, that you are facing.
The v352 driver you have wouldn't support Vulkan. It is older than Vulkan.
Nouveau to my knowledge doesn't support Vulkan (yet) either.

There are three places where the Vulkan loader looks for a Linux driver's JSON definition file:
/etc/vulkan/icd.d
/usr/share/vulkan/icd.d
And whichever manifest files you list in the "VK_ICD_FILENAMES" environment variable.
If you don't have a JSON in one of those locations for your Nvidia driver that would be a problem.
Secondly, if you do have the JSON file, but its "library_path" entry doesn't point to a valid driver, that would also not work.
Try looking for those files.
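
The search described above can be sketched as a small script. This is only a rough approximation of what the loader does, under a few assumptions: it checks just the two directories named above, it reads the override from VK_ICD_FILENAMES (the variable name documented by the Vulkan loader; treat it as an assumption if your loader version differs), and the helper names `find_icd_manifests` and `check_manifest` are made up for this example.

```python
import json
import os
from pathlib import Path

# The two system directories named in the answer above.
DEFAULT_ICD_DIRS = ("/etc/vulkan/icd.d", "/usr/share/vulkan/icd.d")


def find_icd_manifests(dirs=DEFAULT_ICD_DIRS, env=os.environ):
    """Collect ICD manifest paths from the given dirs plus VK_ICD_FILENAMES."""
    manifests = []
    for directory in dirs:
        path = Path(directory)
        if path.is_dir():
            manifests.extend(sorted(path.glob("*.json")))
    # Assumption: the override is a path-separator-delimited list of files.
    override = env.get("VK_ICD_FILENAMES", "")
    manifests.extend(Path(name) for name in override.split(os.pathsep) if name)
    return manifests


def check_manifest(path):
    """Return (library_path, problem); problem is None if nothing looks wrong."""
    try:
        data = json.loads(Path(path).read_text())
        library = data["ICD"]["library_path"]
    except (OSError, ValueError, KeyError, TypeError) as exc:
        return None, f"unreadable manifest: {exc!r}"
    # A bare file name is left to the dynamic linker's search path, so only
    # entries containing a slash can be checked on disk here. Relative paths
    # are taken relative to the manifest file itself.
    if "/" in library:
        resolved = (Path(path).parent / library).resolve()
        if not resolved.exists():
            return library, f"library not found at {resolved}"
    return library, None


if __name__ == "__main__":
    found = find_icd_manifests()
    if not found:
        print("no ICD manifest found in", DEFAULT_ICD_DIRS)
    for manifest in found:
        library, problem = check_manifest(manifest)
        print(manifest, "->", library, "(" + (problem or "looks ok") + ")")
```

If the script prints nothing for your Nvidia driver, or reports a missing library, that matches the two failure cases described above.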

VxWorks porting(DM8168)

I have Spectrum Digital evaluation board (evm816x).
I have a problem when trying to port VxWorks 6.9 to the TMS320DM8168 (DaVinci).
I load U-Boot to NAND and it starts fine. Then I load the VxWorks image with the XDS510 USB emulator; that is fine too, and VxWorks works well. But when I try to start VxWorks from U-Boot, it crashes during the initialization process.
After a few experiments I came to the conclusion that VxWorks starts only after a CPU reset.
What prevents VxWorks from loading on the CPU?
Thank you.

Generally, VxWorks 6.x BSPs are not designed to work with U-Boot. You may encounter random crashes when using the U-Boot go/bootelf/bootvx commands after loading the VxWorks kernel. The reasons can vary: for example, a disagreement with the physical memory addresses configured in U-Boot, or an inconsistent cache/MMU state.
The latest VxWorks 7 supports U-Boot as the bootloader by default on ARM and PPC targets.
The patches have been in the mainline U-Boot Git repo since the U-Boot v2014.01 release.

There are both bootable and loadable VxWorks images. You are probably running a loadable image, which is the default build option for VxWorks in Workbench. That image expects some initialization to have been done by the bootloader (which is a bootable VxWorks that runs the "boottask", which in turn loads the VxWorks image).
In short, try building a bootable/ROMable VxWorks image and loading that. Otherwise, load the bootloader (bootrom) image, which will load your loadable VxWorks image.

OpenCl: Minimal configuration to work with AMD GPU

Suppose we have an AMD GPU (for example a Radeon HD 7970) and a minimal Linux system without X etc.
What should be installed, what should be launched, and how should it be launched to get a proper OpenCL environment? Ideally it should be a headless environment.
Requirements to environment:
GPU visible by OpenCL programs (clinfo for example)
It is possible to monitor temperature and set fan speed (for example using aticonfig).
P.S. Simply installing the X server and Catalyst and running X :0 won't work properly. See "X server with fglrx driver won't responce after exactly 49 accesses to X server".
UPD: When you use an AMD GPU on Linux, OpenCL applications don't see the AMD GPU if the X server isn't running.

I had a similar problem, asked a question, and succeeded in solving it myself.
For R9 290 cards and newer, I assume you have:
Built kernel 4.14 or later with amdgpu driver support. The option is in the Linux kernel config under Graphics Support.
All necessary firmware .bin blobs incorporated. An easy way to do this with buildroot is to edit the contents of buildroot/package/linux-firmware/* and manually add a BR2_PACKAGE_LINUX_FIRMWARE_AMDGPU option yourself, using BR2_PACKAGE_LINUX_FIRMWARE_RADEON as a template. Actually, we should post that update to their git.
When booting, you should see the appropriate dmesg messages about amdgpu initializing for each adapter, and the screen mode should switch. If you still see large console text and no video mode switch occurred during init, then you have a problem in the kernel or firmware and should fix that first.
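
As a complement to reading dmesg, one way to confirm that the kernel driver actually attached to each card is to look at the driver symlink in sysfs. The following is a minimal sketch, assuming the usual /sys/class/drm layout where cardN/device/driver is a symlink to the bound driver; the function name `bound_drivers` is made up for this example.

```python
import os
from pathlib import Path


def bound_drivers(drm_root="/sys/class/drm"):
    """Map each DRM card (card0, card1, ...) to the kernel driver bound to it.

    A value of None means no driver symlink was found for that card.
    """
    result = {}
    root = Path(drm_root)
    if not root.is_dir():
        return result
    for card in sorted(root.glob("card[0-9]*")):
        if "-" in card.name:
            # Skip connector entries such as card0-HDMI-A-1.
            continue
        driver_link = card / "device" / "driver"
        if driver_link.exists():
            # The symlink target's last path component is the driver name.
            result[card.name] = os.path.basename(os.path.realpath(driver_link))
        else:
            result[card.name] = None
    return result


if __name__ == "__main__":
    for card, driver in bound_drivers().items():
        print(card, "->", driver or "no driver bound")
```

On a working setup of the kind described above, each AMD adapter would be expected to show amdgpu as its driver.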
To answer the second question: fan speeds and temperatures are controlled via the powerplay filesystem, e.g. /sys/class/drm/.., like this (the hwmon index may differ on your system):
cd /sys/class/drm/card0/device/hwmon/hwmon0
echo 1 > pwm1_enable
cat pwm1_max > pwm1
You may dig a bit deeper and find powertune parameters nearby, in device folder.
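
The shell commands above can also be scripted, which is handy on a headless box with several cards. This is a minimal sketch, assuming the standard hwmon file names (pwm1_enable, pwm1_max, pwm1, temp1_input) are present for the card and that the script runs as root; the function names are made up for this example.

```python
from pathlib import Path


def find_hwmon_dirs(card_device="/sys/class/drm/card0/device"):
    """Return the hwmon directories of a card; the hwmonN index can vary."""
    return sorted(Path(card_device).glob("hwmon/hwmon*"))


def set_fan_to_max(hwmon_dir):
    """Switch pwm1 to manual control and write its maximum value to it."""
    hwmon = Path(hwmon_dir)
    (hwmon / "pwm1_enable").write_text("1\n")  # 1 = manual fan control
    max_pwm = (hwmon / "pwm1_max").read_text().strip()
    (hwmon / "pwm1").write_text(max_pwm + "\n")
    return int(max_pwm)


def read_temperature_celsius(hwmon_dir):
    """Read temp1_input, which hwmon reports in millidegrees Celsius."""
    raw = (Path(hwmon_dir) / "temp1_input").read_text()
    return int(raw) / 1000.0


if __name__ == "__main__":
    for hwmon in find_hwmon_dirs():
        print(hwmon, "temperature:", read_temperature_celsius(hwmon), "C")
        print(hwmon, "fan set to pwm value", set_fan_to_max(hwmon))
```

Looking the hwmon directory up with a glob avoids hard-coding hwmon0, since the index is not guaranteed to be the same on every system.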
But instead of using /sys/class/drm/card0/device/pp_dpm_sclk, I highly recommend flashing those values directly into the card's BIOS, set to the required frequencies/voltages, as it is more reliable, stable and API independent - you either init it, or not :)
PS. Also, put away the 7970 and buy something a bit newer. I don't know if it is still supported in the latest drivers; we don't have such an old card on hand right now. I tested the 290, 390, 480 and 580 card series (for the R9 270, the miner fails to build the CL code). For older cards it is better to use older software (<=16.40) and maybe a slightly older kernel (<=4.13).
