TensorFlow sees only XLA_GPU and not GPU - tensorflow

I have had a problem for a few days.
I installed the NVIDIA drivers and cuDNN by following a step-by-step tutorial from the Internet.
The installation succeeded, since the tests on the CUDA samples passed.
I then installed Python 3.7, Jupyter and tensorflow-gpu.
However, TensorFlow doesn't see my 2 GPUs and sees only XLA_GPUs.
I tried recommendations from other posts (such as uninstalling and reinstalling TensorFlow), but this did not solve my problem.
Does anyone have an idea how to solve this problem?
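For reference, the symptom described above can be checked with a small sketch like the one below. The helper names (count_device_types, only_xla_gpus, tf_device_types) are hypothetical and introduced only for illustration; the TensorFlow call is device_lib.list_local_devices(), which is assumed to be available in the installed version.

```python
from collections import Counter


def count_device_types(device_types):
    """Count how many devices of each type were reported."""
    return dict(Counter(device_types))


def only_xla_gpus(device_types):
    """Return True when XLA_GPU devices are listed but no plain GPU device is."""
    counts = count_device_types(device_types)
    return counts.get("XLA_GPU", 0) > 0 and counts.get("GPU", 0) == 0


def tf_device_types():
    """Return the device types TensorFlow reports (requires TensorFlow)."""
    # Imported lazily so the helpers above work without TensorFlow installed.
    from tensorflow.python.client import device_lib
    return [d.device_type for d in device_lib.list_local_devices()]
```

Calling only_xla_gpus(tf_device_types()) on the affected machine would be expected to return True if the situation matches the description, but it only reports the symptom and says nothing about the cause.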

Related

Are there problems with TensorFlow and Keras on macOS 12.2 (Monterey)?

I bought a new Mac running macOS 12.2 (Monterey) and installed Anaconda. Most Python packages install correctly. However, TensorFlow (ver 2.8.0) and Keras (ver 2.8.0) have major issues: the Jupyter notebook kernel gets killed when tensorflow and/or keras is imported. I looked through various posts on Stack Overflow and Medium, but nothing seemed helpful. I even tried converting the .ipynb to a .py script, but the same error occurs.
Is there anything that can be done to resolve this?

Does CUDA 11 with an RTX 3080 support TensorFlow and Keras?

I attached an RTX 3080 to my computer, but when training with Keras 2.3.1 and TensorFlow 1.15, I got an error: "failed to run cuBLAS_STATUS_EXECUTION_FAILED, did not mem zero GPU location . . . check failed:start_event !=nullptr && stop_event != nullptr". I think the problem is that the recently released RTX 3080 and CUDA 11 do not yet support Keras 2.xx and TensorFlow 1.xx. Is this right? And what causes the problem?
At the time of writing, the Nvidia 30xx series only fully supports CUDA version 11.x, see https://forums.developer.nvidia.com/t/can-rtx-3080-support-cuda-10-1/155849/2
TensorFlow 1.15 was not fully supported on CUDA 10.1 and newer, probably for a reason similar to the one described in the link above. Unfortunately, TensorFlow 1.x is no longer supported or maintained, see https://github.com/tensorflow/tensorflow/issues/43629#issuecomment-700709796
TensorFlow 2.4 is your best bet with an Ampere GPU. It now has a stable release and official support for CUDA 11.0, see https://www.tensorflow.org/install/source#gpu
As TensorFlow 1.x is never going to be updated or maintained by the TensorFlow team, I would strongly suggest moving to TensorFlow 2.x. Personal preferences aside, it is better in almost every way, and it has the tf.compat module for backwards compatibility with TensorFlow 1.x code if rewriting your code base is not an option. However, even that module is no longer maintained, which really shows that version 1.x is dead, see https://www.tensorflow.org/guide/versions#what_is_covered
However, if you're dead set on using TensorFlow 1.15, you might have a chance with Nvidia TensorFlow, which apparently supports version 1.15 on Ampere GPUs, see https://developer.nvidia.com/blog/accelerating-tensorflow-on-a100-gpus/
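As a rough illustration of the version advice above, here is a minimal sketch that checks whether a TensorFlow version string is at least 2.4, the release the answer names as having official CUDA 11.0 support. The names parse_version, MIN_TF_FOR_CUDA_11 and supports_cuda_11 are hypothetical, and the check covers only the stock releases; it knows nothing about the Nvidia-maintained 1.15 build.

```python
# Minimum stock TensorFlow release with official CUDA 11.0 support,
# taken from the answer above (an assumption for any other setup).
MIN_TF_FOR_CUDA_11 = (2, 4)


def parse_version(version):
    """Turn a string such as '2.4.0' into a (major, minor) tuple of ints."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def supports_cuda_11(tf_version):
    """Return True if the stock TensorFlow release is 2.4 or newer."""
    return parse_version(tf_version) >= MIN_TF_FOR_CUDA_11
```

With TensorFlow installed, the check could be run as supports_cuda_11(tf.__version__) after import tensorflow as tf.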

TensorFlow does not generate GPU tracing information

I started a new machine learning project.
According to this document (https://www.tensorflow.org/tensorboard/tensorboard_profiling_keras), TF with TensorBoard appears to support GPU profiling, so I used the same code in my Jupyter Notebook for testing.
The sample code generates profiling results. However, there is no GPU tracing information in the resulting file (only CPU).
This is my main problem.
I am using two RTX 2080 Ti graphics cards.
They were working while the code was running.
The sample code does not use MirroredStrategy, so I could see that only one of them was running.
At first, I thought TensorBoard was the problem, but I soon realized that TF does not generate the GPU tracing information.
The image above is the resulting file (local.trace). There was no GPU data.
Here is my system specification:
OS ubuntu 18.04
jupyter-client 5.3.4
jupyter-core 4.6.1
jupyter-tensorboard 0.1.10
tensorflow-gpu 2.0.0
tensorflow-estimator 2.0.1
tensorflow-metadata 0.15.1
tensorboard 2.0.2
nVidia 410.104
CUDA 10.0
anaconda 4.7.12 (with python 3.6)
It looks irrelevant, but there was a warning message like the one in the image below.
I have tested this on another PC and got the same result. It could be that GPU profiling is only supported on Google Colab (I am still confused). I have recently searched Google for a fix but still could not find an answer.
Is anyone using GPU profiling on their own system instead of Google Colab?
Please give me some advice.
I figured out what caused the problem.
It was related to CUPTI (CUDA Profiling Tools Interface).
Unlike in Jupyter Notebook, a warning message appeared when the code was run from the Ubuntu shell:
CUPTI error: CUPTI could not be loaded or symbol could not be found.
TF could not find the CUPTI libraries. This was the main cause of the problem.
After adding the path to LD_LIBRARY_PATH as described in the link below, the problem was fixed!
https://stackoverflow.com/a/58752904/5553618
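To check whether the fix above took effect, a minimal sketch like the following can look for the CUPTI shared library in the directories listed in LD_LIBRARY_PATH. The function name find_cupti is hypothetical, and the file pattern libcupti.so* is an assumption about how the library is named on a Linux CUDA install.

```python
import glob
import os


def find_cupti(ld_library_path):
    """Return libcupti files found in the directories of an LD_LIBRARY_PATH string."""
    matches = []
    for directory in ld_library_path.split(os.pathsep):
        if directory:  # skip empty entries such as a trailing ':'
            pattern = os.path.join(directory, "libcupti.so*")
            matches.extend(sorted(glob.glob(pattern)))
    return matches
```

Calling find_cupti(os.environ.get("LD_LIBRARY_PATH", "")) in the same shell or notebook kernel that runs TensorFlow returns an empty list when no listed directory contains the library, which would be consistent with the CUPTI warning above.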

Why has the GPU stopped working for me in Google Colab?

I am a university professor trying to learn deep learning for a possible class in the future. I have been using Google Colab with GPU support for the past couple of months. Just recently, the GPU device has stopped being found, even though I am doing everything the same way as before. I can't imagine that I have done anything wrong, because I am just working through tutorials from books and the TensorFlow 2.0 tutorials site.
TensorFlow 2 on Colab GPU was broken recently due to an upgrade from CUDA 10.0 to CUDA 10.1. As of this afternoon, the issue should be resolved for the TensorFlow builds bundled with Colab. That is, if you run the following magic command:
%tensorflow_version 2.x
then import tensorflow will import a working, GPU-compatible tensorflow 2.0 version.
Note, however, if you attempt to install a version of tensorflow using pip install tensorflow-gpu or similar, the result may not work in Colab due to system incompatibilities.
See https://colab.research.google.com/notebooks/tensorflow_version.ipynb for more information.

Does loading a saved TensorFlow model save time?

The problem is that I cannot get tensorflow-gpu working on my Ubuntu system, because the NVIDIA driver cannot be installed on Ubuntu. So I run tensorflow-gpu on Windows 10, but it does not support tensorflow-serving.
I know Docker can help with this, and I did install it, but only with tensorflow-cpu. Running only the tensorflow-cpu version would be very slow.
So I came up with the idea of installing two copies of TensorFlow: the GPU version on the host system and the CPU version in Docker. The GPU version would be used for training and saving a model, and the CPU version would then load the saved model.
What I want to know is whether this approach works and whether it saves time. Put simply, does it take less time than just running the tensorflow-cpu version in Docker?
TensorFlow GPU with NVIDIA GPUs on Ubuntu is supported, and there are drivers available. Check this tutorial.