NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver - google-colaboratory

Good afternoon. I get the message "NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running." even though I am subscribed to Colab Pro. My settings are correct. Please help me; I don't know where else to ask.
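A minimal Python sketch for narrowing this down from inside a notebook cell. It only distinguishes "nvidia-smi is missing" from "nvidia-smi is present but cannot reach the driver". The function names and the wording of the diagnoses are illustrative assumptions, not part of any Colab API. On Colab, one common (but not the only) cause of this message is a runtime with no GPU attached.

```python
import shutil
import subprocess


def describe_smi(returncode, output):
    """Turn an nvidia-smi exit code and its output into a short diagnosis."""
    if returncode is None:
        return "nvidia-smi not found: no NVIDIA driver tools on this machine"
    if returncode == 0:
        return "driver reachable"
    if "couldn't communicate with the NVIDIA driver" in output:
        return "driver not loaded or no GPU attached to this runtime"
    return "nvidia-smi failed for another reason"


def run_smi():
    """Run nvidia-smi if it is on PATH; return (returncode, combined output)."""
    path = shutil.which("nvidia-smi")
    if path is None:
        return None, ""
    proc = subprocess.run([path], capture_output=True, text=True)
    return proc.returncode, proc.stdout + proc.stderr


if __name__ == "__main__":
    print(describe_smi(*run_smi()))
```

If the result is "driver not loaded or no GPU attached to this runtime", it is worth re-checking the runtime type before assuming anything is wrong with the subscription.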

Related

[Win11]GPU appears in device manager but not in task manager or nvidia controller

I tried to attach an RTX 4090 through the PCIe M.2 port. I successfully installed the driver for it, and Device Manager says it works properly. However, nvidia-smi couldn't find it, and neither TensorFlow nor torch could recognize it as a CUDA device. My laptop is an MSI GE66 with a Core i9-11980HK and an RTX 3080 (Laptop), which I had previously disabled. If I connect the GPU over Thunderbolt 4 instead, it works fine.
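To confirm what each framework actually sees, a small sketch like the following can help. The helper names `probe` and `cuda_report` are made up for illustration. It assumes TensorFlow 2.1 or later, where `tf.config.list_physical_devices` exists; older releases expose it under `tf.config.experimental`.

```python
import importlib


def probe(module_name, check):
    """Import a module and run a check on it.

    Returns None if the module is not installed, otherwise the check
    result coerced to True or False.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return bool(check(module))


def cuda_report():
    """Report whether torch and TensorFlow can see a CUDA-capable GPU."""
    return {
        "torch": probe("torch", lambda torch: torch.cuda.is_available()),
        # Assumption: TensorFlow >= 2.1 (list_physical_devices is not experimental).
        "tensorflow": probe(
            "tensorflow", lambda tf: tf.config.list_physical_devices("GPU")
        ),
    }
```

Calling `cuda_report()` returns a dictionary with one entry per framework: `None` means the framework is not installed, `False` means it is installed but sees no GPU.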

How to install GPU driver on Google Deep Learning VM?

I just created a Google Deep Learning VM with this image:
c1-deeplearning-tf-1-15-cu110-v20210619-debian-10
The tensorflow version is 1.15.5. But when I run
nvidia-smi
it says -bash: nvidia-smi: command not found.
When I run
nvcc --version
I got
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Thu_Jun_11_22:26:38_PDT_2020
Cuda compilation tools, release 11.0, V11.0.194
Build cuda_11.0_bu.TC445_37.28540450_0
Does anyone know how to install the GPU driver? Thank you in advance!
Update: I've noticed that if you select GPU instance, then the GPU driver is pre-installed.
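The symptoms above fit a machine that has the CUDA toolkit but no driver: nvcc ships with the toolkit, while nvidia-smi ships with the driver. A minimal sketch of that distinction, with hypothetical function names and message strings:

```python
import shutil


def diagnose(has_nvcc, has_smi):
    """Classify a machine by which NVIDIA command-line tools are on PATH."""
    if has_smi:
        return "driver utilities present"
    if has_nvcc:
        return "CUDA toolkit present, driver missing"
    return "neither toolkit nor driver found"


def diagnose_this_machine():
    """Apply diagnose() to the current PATH."""
    return diagnose(
        shutil.which("nvcc") is not None,
        shutil.which("nvidia-smi") is not None,
    )
```

On the VM described in the question, this sketch would be expected to report that the toolkit is present and the driver is missing.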
This is the guide: Installing GPU drivers.
Required NVIDIA driver versions
NVIDIA GPUs running on Compute Engine must use the following NVIDIA driver versions:
For A100 GPUs:
Linux : 450.80.02 or later
Windows: 452.77 or later
For all other GPU types:
Linux : NVIDIA 410.79 driver or later
Windows : 426.00 driver or later
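The minimum versions listed above can be checked mechanically. This is a rough sketch: the table is copied from the list, and the function names and the "a100"/"other" keys are assumptions made for the example.

```python
# Minimum driver versions, copied from the list above.
MIN_DRIVER = {
    ("a100", "linux"): "450.80.02",
    ("a100", "windows"): "452.77",
    ("other", "linux"): "410.79",
    ("other", "windows"): "426.00",
}


def parse_version(text):
    """Split a dotted version such as '450.80.02' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(installed, gpu_type, os_name):
    """Return True if the installed driver is at least the listed minimum."""
    family = "a100" if gpu_type.lower() == "a100" else "other"
    required = MIN_DRIVER[(family, os_name.lower())]
    return parse_version(installed) >= parse_version(required)
```

The installed version can be read with `nvidia-smi --query-gpu=driver_version --format=csv,noheader` once the driver is working.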
I would suggest deleting the instance and creating another one. Keep in mind the version compatibility here and here. If you are installing the drivers yourself, then what's the point of using a pre-built instance?

Blender 2.9 Could not find a matching GPU name warning on Chromebook

I'm using an Asus Chromebook with a CPU(I think).
This is what the Error says:
Warning: Could not find a matching GPU name. Things may not behave as expected.
Detected OpenGL configuration:
Vendor: Red Hat
Renderer: virgl
/run/user/1000/gvfs/ non-existent directory
found bundled python: /home/sekhong5417/blender/2.90/python
This works on my friend's Chromebook, which has a GPU.
Also, I am kind of young, so I can't replace anything or buy a new device.
There are images at the bottom
If anyone still runs into this issue, there is an incompatibility with Blender and Intel ChromeOS GPU drivers.
See https://developer.blender.org/T77651#1172666 for more details and an updated working build of v2.93.
Hopefully, the fix gets included in the next release.
I use an Acer Chromebook Spin 13 and ran into the same issue. I think the Debian environment within the Chromebook may not have a driver that matches the Intel GPU. My Chromebook uses Intel HD Graphics 620. I tried many ways to install the driver, but they all failed. Linux works more easily with Nvidia GPUs, though. So my suggestion is to look for an Intel driver that matches your graphics card and try again.

Caffe and Tensorflow on a Dell 7559 with nvidia optimus technology

I bought a Dell 7559 laptop for deep learning. I installed Ubuntu 16.04 on it, but I am having trouble getting Caffe and TensorFlow to work. The laptop uses Nvidia Optimus technology to switch between the integrated and discrete GPUs to save battery. I checked the BIOS to see if I could set it to use only the Nvidia GPU, but there is no option for it. Using Bumblebee or nvidia-prime didn't work either. I now run Ubuntu 16 with the MATE desktop environment, which prevents the black screen but didn't help with the CUDA issue. I was able to install the drivers and CUDA, but when I build Caffe and TensorFlow they fail, saying that no GPU was detected. I also wasn't able to install OpenGL. I tried several versions of the Nvidia drivers, but it didn't help. Any help would be great. Thanks.
I think Bumblebee can enable you to run Caffe/TensorFlow in GPU mode. More generally, it also allows you to run other CUDA programs on a laptop with Optimus technology.
Once you have installed Bumblebee correctly (tutorial: Bumblebee Wiki for Ubuntu), you can invoke the Caffe binary by prepending optirun to the command, like this:
optirun ../../caffe-master/build/tools/caffe train --solver=solver.prototxt
This works for the NVidia DIGITS server as well:
optirun ./digits-devserver
In addition, Bumblebee also works on my dual-graphics desktop PC (Intel HD 4600 + GTX 750 Ti). The display on my PC is driven by the Intel HD 4600 through the HDMI port on the motherboard. The NVidia GTX 750 Ti is only used for CUDA programs.
In fact, for my desktop PC, the "nvidia-prime" (it's actually invoked through the command line program prime-select) is used to choose the GPU that drives the desktop. I have the integrated GPU connect to the display with the HDMI port and the NVidia GPU through a DisplayPort. Currently, the DisplayPort is inactive. The display signal comes from the HDMI port.
As far as I understand, PRIME does so by modifying /etc/X11/xorg.conf to make either the Intel integrated GPU or the NVidia GPU the current display adapter available to X. I think the PRIME settings only make sense when both GPUs are connected to some display, which means there need not be an Optimus link between the two GPUs like in a laptop (or, for a laptop with a mux such as the Dell Precision M4600, Optimus is disabled in the BIOS).
More information about the Display Mux and Optimus may be found here: Using the NVIDIA Driver with Optimus Laptops
Hope this helps!

TensorFlow isn't using Nvidia

TensorFlow fails to use the Nvidia card even though the Nvidia driver, CUDA toolkit, and cuDNN are installed and configured.
I suspect the reason is that the Nvidia card on my laptop is attached to PCI as a 3D controller instead of VGA:
00:02.0 VGA compatible controller: Intel Corporation Sky Lake Integrated Graphics (rev 07)
Subsystem: ASUSTeK Computer Inc. Skylake Integrated Graphics
Kernel driver in use: i915_bpo
01:00.0 3D controller: NVIDIA Corporation GK208M [GeForce 920M] (rev a1)
Subsystem: ASUSTeK Computer Inc. GK208M [GeForce 920M]
Kernel modules: nvidiafb, nouveau, nvidia_304
Even the Nvidia xserver settings don't see the GPU:
Is it true that TensorFlow can only use a graphics card that is attached as VGA?
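One detail in the listing above is worth checking programmatically. The Intel device reports a "Kernel driver in use" line, while the NVIDIA device lists only candidate kernel modules, which suggests that no driver is bound to it. The "3D controller" class by itself is, as far as I know, normal for a secondary GPU on Optimus laptops. Below is a small sketch that parses this kind of `lspci -k` output. The function name is hypothetical and the sample is the listing from the question.

```python
import re

# A device line starts with a PCI address such as "01:00.0".
PCI_ADDRESS = re.compile(r"^[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]\s")

SAMPLE = """\
00:02.0 VGA compatible controller: Intel Corporation Sky Lake Integrated Graphics (rev 07)
Subsystem: ASUSTeK Computer Inc. Skylake Integrated Graphics
Kernel driver in use: i915_bpo
01:00.0 3D controller: NVIDIA Corporation GK208M [GeForce 920M] (rev a1)
Subsystem: ASUSTeK Computer Inc. GK208M [GeForce 920M]
Kernel modules: nvidiafb, nouveau, nvidia_304
"""


def drivers_in_use(lspci_text):
    """Map each PCI address to its bound kernel driver, or None if unbound."""
    result = {}
    current = None
    for raw in lspci_text.splitlines():
        line = raw.strip()
        if PCI_ADDRESS.match(line):
            current = line.split()[0]
            result[current] = None
        elif current and line.startswith("Kernel driver in use:"):
            result[current] = line.split(":", 1)[1].strip()
    return result


if __name__ == "__main__":
    for address, driver in drivers_in_use(SAMPLE).items():
        print(address, driver or "no driver bound")
```

A device that maps to `None` has no kernel driver loaded for it, so neither nvidia-smi nor TensorFlow can use it regardless of how it is classified.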
After three months, I finally figured out what the issue was and resolved it. It turned out to be an Nvidia issue with Secure Boot.
I feel obliged to thank jorgemf and Yao Zhang for their help at a time when I couldn't even articulate the problem well.
Meanwhile, I hope my case can help other people who have the same problem.
It all started with my attempt to install the Nvidia driver again today. The installation seemed successful, but at the end it said:
Unable to load the “nvidia-drm” kernel module.
So I thought maybe I could manually load the kernel module with
modprobe nvidia-drm
but got an error saying something like "required key not available". I wondered what that meant, so I googled a bit. It turned out that the module was not signed with a key the system trusts, so it had been blocked by Secure Boot!
I went back to the boot settings and disabled Secure Boot. I installed the Nvidia driver again, and this time it succeeded! Now in Nvidia settings it looks like this:
See now the gpu device shows there.
I went on to install CUDA and cuDNN. I found this GitHub gist super useful: https://gist.github.com/wangruohui/df039f0dc434d6486f5d4d098aa52d07
As the last step, I followed the installation instructions on the TensorFlow home page and confirmed that it ran on the GPU!
The take-home message: if you fail to install the Nvidia driver on a Linux system, you probably need to disable Secure Boot. In my personal opinion, Windows turned this good idea into a nightmare for Linux users!
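Before changing firmware settings, it may be worth checking whether Secure Boot is actually on. Here is a hedged sketch that wraps `mokutil --sb-state`, which prints "SecureBoot enabled" or "SecureBoot disabled" on systems where mokutil is installed. The function names are made up for the example.

```python
import shutil
import subprocess


def parse_sb_state(output):
    """Interpret mokutil output: True = enabled, False = disabled, None = unknown."""
    text = output.lower()
    if "secureboot enabled" in text:
        return True
    if "secureboot disabled" in text:
        return False
    return None


def secure_boot_enabled():
    """Query the Secure Boot state; return None if mokutil is unavailable."""
    path = shutil.which("mokutil")
    if path is None:
        return None
    proc = subprocess.run([path, "--sb-state"], capture_output=True, text=True)
    return parse_sb_state(proc.stdout + proc.stderr)
```

If `secure_boot_enabled()` returns `True` and loading the module fails with a key error, that is consistent with the situation described above.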