Why does tf.test.is_gpu_available() not return True or False, but get stuck? - tensorflow

After installing tensorflow-gpu 2.0.0, the check got stuck right after detecting the GPU.
The environment settings for this project are:
Ubuntu 18.04
CUDA 10.0
cuDNN 7.4.1
I created a virtual environment and installed tensorflow-gpu==2.0.0.
While checking for the GPU with tf.test.is_gpu_available(), the call got stuck, as shown in the attached screenshot.

Changing the cuDNN version to 7.6.2 fixed it; it works well now.
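For anyone verifying the fix, here is a minimal sketch (assuming TensorFlow 2.x in the same virtual environment) that simply lists the physical GPUs the runtime can see, which is a lighter check than tf.test.is_gpu_available():

import tensorflow as tf

# List the physical GPUs visible to the TensorFlow runtime;
# an empty list means no GPU was detected.
gpus = tf.config.experimental.list_physical_devices('GPU')
print("GPUs visible to TensorFlow:", gpus)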

Related

Upgrading Cudnn version in Vertex AI Notebook [Kernel Restarting Problem]

Problem: the cuDNN version is incompatible with TensorFlow and CUDA; the kernel dies and I am unable to start training in Vertex AI.
Current versions:
import tensorflow as tf
from tensorflow.python.platform import build_info as build
print(f"tensorflow version: {tf.__version__}")
print(f"Cuda Version: {build.build_info['cuda_version']}")
print(f"Cudnn version: {build.build_info['cudnn_version']}")
tensorflow version: 2.10.0
Cuda Version: 11.2
Cudnn version: 8
As per the information (shown in the attached screenshot) available here, the cuDNN version must be 8.1.
A similar question related to upgrading cuDNN in Google Colab has been asked here. However, it does not solve my issue. Every other online source is helpful for the Anaconda environment only.
How can I upgrade cuDNN in my case?
Thank you.
I tried several combinations of TensorFlow, CUDA, and cuDNN versions in Google Colab, and the following combination worked [OS: Ubuntu 20.04]:
tensorflow version: 2.9.2
Cuda Version: 11.2
Cudnn version: 8
Therefore, I downgraded the TensorFlow version in Vertex AI from 2.10.0 to 2.9.2 and it worked (this solved only the incompatibility issue). I'm still searching for a solution to the kernel restarting.
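As a quick post-downgrade sanity check, a sketch like the following (assuming TensorFlow 2.3+ so that tf.sysconfig.get_build_info() is available) prints the build-time CUDA/cuDNN versions next to the GPUs that are actually visible at runtime:

import tensorflow as tf

# CUDA/cuDNN versions this TensorFlow wheel was built against
info = tf.sysconfig.get_build_info()
print("built for CUDA:", info['cuda_version'])
print("built for cuDNN:", info['cudnn_version'])

# The GPU should show up here if the driver/CUDA/cuDNN stack is consistent
print("visible GPUs:", tf.config.list_physical_devices('GPU'))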
UPDATE:
The kernel restarting problem got fixed after I changed the kernel from Tensorflow 2 (Local) to Python (Local) in Vertex AI's notebook, as shown in the attached image [the kernel-changing option is available at the top right, near the bug symbol].

cuda install for mask rcnn on ubuntu 18.04

I'm working with Mask R-CNN and need a GPU for real-time work. The system requirements call for CUDA 10.0 (tensorflow-gpu==1.15). I installed Ubuntu 18.04 and then CUDA 10.0, but after rebooting, the screen freezes at the user login. Can you help me solve this problem?

Unable to configure tensorflow to use GPU acceleration in Ubuntu 16.04

I am trying to install TensorFlow on Ubuntu 16.04 (in Google Cloud). So far I have created a compute instance and added an NVIDIA Tesla K80 to it.
I have also made sure that the proper version of TensorFlow (1.14.0) is installed, that CUDA 8.0 is installed, and that cuDNN 6.0 is installed, as per the TensorFlow GPU - CUDA mapping.
When I run a simple TensorFlow program, I get:
Cannot assign a device for operation MatMul: {{node MatMul}} was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:XLA_CPU:0, /job:localhost/replica:0/task:0/device:XLA_GPU:0 ]. Make sure the device specification refers to a valid device.
Can anyone please let me know where I am going wrong? Is the instance selection correct? Thanks for your help.
The CUDA and cuDNN versions that have been tested with TensorFlow 1.14 are 10.0 and 7.4, respectively.
More information about version compatibility can be found here.
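As a sketch of how this surfaces in code (assuming the TF 1.14 graph/session API), explicitly pinning the op to /device:GPU:0 reproduces the "Cannot assign a device" error when no real GPU device is registered, while allow_soft_placement lets the graph fall back to the CPU and log_device_placement shows where each op actually runs:

import tensorflow as tf
from tensorflow.python.client import device_lib

# Show every device the TF 1.x runtime registered (CPU, XLA_CPU, GPU, ...)
print(device_lib.list_local_devices())

# Explicit GPU placement fails with "Cannot assign a device for operation MatMul"
# when no GPU kernel/device is available.
with tf.device('/device:GPU:0'):
    a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
    b = tf.constant([[1.0, 1.0], [0.0, 1.0]])
    c = tf.matmul(a, b)

# allow_soft_placement=True falls back to the CPU instead of raising the error.
config = tf.ConfigProto(allow_soft_placement=True, log_device_placement=True)
with tf.Session(config=config) as sess:
    print(sess.run(c))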

Installing Tensorflow with GPU support fails without any error in Ubuntu 16.04

This is how I configured the installation of TensorFlow -> screenshot; at the end, the CUDA libraries are not being set up.
I have installed CUDA 8.0 and cuDNN 5.1.10, and I have an NVIDIA GeForce 1060 graphics card. Can anyone tell me the solution for this? Thanks in advance.
I am new to Stack Overflow, so please excuse any mistakes.
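One hedged way to narrow this down (assuming the install completed and the package imports) is to ask TensorFlow whether the wheel was actually compiled with CUDA support before debugging the CUDA libraries themselves:

import tensorflow as tf

# True only if this TensorFlow build was compiled with CUDA support
print("built with CUDA:", tf.test.is_built_with_cuda())

# In TF 1.x this also reports whether a usable GPU device was found at runtime
print("GPU available:", tf.test.is_gpu_available())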

GKE - NVIDIA GPU - CUDA drivers don't work

I have set up a Kubernetes node with an NVIDIA Tesla K80 and followed this tutorial to try to run a PyTorch Docker image with the NVIDIA and CUDA drivers working.
I have managed to install the NVIDIA DaemonSets and I can now see the following pods:
nvidia-driver-installer-gmvgt
nvidia-gpu-device-plugin-lmj84
The problem is that even while using the recommended image nvidia/cuda:10.0-runtime-ubuntu18.04, I still can't find the NVIDIA drivers inside my pod:
root@pod-name-5f6f776c77-87qgq:/app# ls /usr/local/
bin cuda cuda-10.0 etc games include lib man sbin share src
But the tutorial mentions:
CUDA libraries and debug utilities are made available inside the container at /usr/local/nvidia/lib64 and /usr/local/nvidia/bin, respectively.
I have also tried to test whether CUDA was working via torch.cuda.is_available(), but I get False as a return value.
Many thanks in advance for your help.
OK, so I finally made the NVIDIA drivers work.
It is mandatory to set a resource limit to access the NVIDIA driver, which is weird considering my pod was already on the right node with the NVIDIA drivers installed.
This made the nvidia folder accessible, but I'm still unable to make the CUDA install work with PyTorch 1.3.0 [issue here].
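For what it's worth, a small diagnostic sketch like this (run inside the pod, with the PyTorch image in question) shows which CUDA version the installed PyTorch wheel was built against versus what the container actually exposes:

import torch

# PyTorch version and the CUDA toolkit version the wheel was built against
print("torch:", torch.__version__)
print("built for CUDA:", torch.version.cuda)

# True only if the NVIDIA driver and a compatible CUDA runtime are visible in the pod
print("cuda available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))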