How does TensorFlow use cuDNN?

I'm currently reading the TensorFlow source code and am curious about the implementation of the kernels. I found that most of the GPU implementations point to Eigen. Could anyone tell me how TensorFlow uses cuDNN: via Eigen, or something else?

Yes, most basic kernels use Eigen, which uses plain CUDA. Kernels that use cuDNN (e.g. convolution) go through this integration: https://github.com/tensorflow/tensorflow/tree/master/tensorflow/stream_executor/cuda
Here is an example Conv kernel that retrieves the supported convolution algorithms (including cuDNN ones, if it is linked and available), runs them to choose the best one, and finally uses it.
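If you just want to confirm from Python which cuDNN version your installed binary was built against, here is a minimal sketch (assuming a TF 2.x wheel, where tf.sysconfig.get_build_info() is available):

import tensorflow as tf

# Reports the CUDA and cuDNN versions the installed wheel was compiled
# and linked against (the available keys may vary between TF releases).
info = tf.sysconfig.get_build_info()
print("CUDA version: ", info.get("cuda_version"))
print("cuDNN version:", info.get("cudnn_version"))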

Related

How to build a Keras (TF) model using the GPU?

The question is pretty straightforward but nothing has really been answered.
Pretty simple: how do I know that when I build a Sequential() model in TensorFlow via Keras, it is going to use my GPU?
In Torch this is easy: you just use the 'device' parameter and can verify it via the volatile GPU utilization metric in nvidia-smi. I tried it while building a model in TF, but nvidia-smi shows 0% usage across all GPU devices.
TensorFlow uses the GPU for most operations by default when:
- it detects at least one GPU, and
- its GPU support is installed and configured properly. For information on how to install and configure it properly for GPU support, see: https://www.tensorflow.org/install/gpu
One requirement to emphasize is that a specific version of the CUDA library has to be installed, e.g. TensorFlow 2.5 requires CUDA 11.2. Check here for the CUDA version required by each version of TF.
To know whether it detects GPU devices:
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
It will also print out debug messages by default to stderr to indicate whether the GPU support is configured properly and whether it detects GPU devices.
To validate with nvidia-smi that it is really using the GPU:
You have to define a sufficiently deep and complex neural network model such that the bottleneck is on the GPU side. This can be achieved by increasing the number of layers and the number of channels in each layer.
When training or running inference with the model, e.g. model.fit() and model.evaluate(), the GPU utilization reported by nvidia-smi should be high.
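For example, a sketch along these lines (the layer sizes are arbitrary, just large enough to keep the device busy) should push the GPU utilization reported by nvidia-smi well above 0% during model.fit():

import numpy as np
import tensorflow as tf

# A deliberately heavy toy model so the GPU, not the input pipeline,
# becomes the bottleneck during training.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(128, 3, activation="relu", input_shape=(128, 128, 3)),
    tf.keras.layers.Conv2D(128, 3, activation="relu"),
    tf.keras.layers.Conv2D(256, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Random data is enough to load the GPU; watch nvidia-smi while this runs.
x = np.random.rand(512, 128, 128, 3).astype("float32")
y = np.random.randint(0, 10, size=(512,))
model.fit(x, y, batch_size=32, epochs=2)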
To know exactly where each operation will be executed, you can add the following line at the beginning of your code:
tf.debugging.set_log_device_placement(True)
For more information: https://www.tensorflow.org/guide/gpu
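As a minimal sketch (TF 2.x assumed), combining the placement log with an explicit tf.device() scope makes it easy to see where each op lands:

import tensorflow as tf

tf.debugging.set_log_device_placement(True)

# Each op now logs its assigned device to stderr, e.g.
# "Executing op MatMul in device /job:localhost/.../device:GPU:0".
a = tf.random.normal([1000, 1000])
b = tf.random.normal([1000, 1000])
c = tf.matmul(a, b)

# An op can also be pinned to a specific device explicitly.
with tf.device("/CPU:0"):
    d = tf.matmul(a, b)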

Can I use TensorFlow GPU version without cuDNN, and how?

I am using the TensorFlow 1.13 GPU version (with CUDA), and for certain reasons I do not want to use cuDNN for the convolutions. Does anyone know how to do that?
You can't use TensorFlow GPU without CUDA, because Keras is based on it. TensorFlow bundles another library inside it, called Keras, which uses this dependency.

How can I check if keras/tensorflow is using cuDNN?

I have installed CUDA and cuDNN, but the latter was not working and gave a lot of error messages in Theano. Now I am training moderately sized deep conv nets in Keras/TensorFlow without getting any cuDNN error messages. How can I check whether cuDNN is now being used?
tl;dr: If tensorflow-gpu works, then CuDNN is used.
The prebuilt binaries of TensorFlow (at least since version 1.3) link to the CuDNN library. If CuDNN is missing, an error message will tell you ImportError: Could not find 'cudnn64_7.dll'. TensorFlow requires that this DLL be installed....
According to the TensorFlow install documentation for version 1.5, CuDNN must be installed for GPU support even if you build it from source. There are still a lot of fallbacks in the TensorFlow code for the case of CuDNN not being available -- as far as I can tell it used to be optional in prior versions.
Here are two lines from the TensorFlow source that explicitly state and enforce that CuDNN is required for GPU acceleration.
There is a special GPU version of TensorFlow that needs to be installed in order to use the GPU (and CuDNN). Make sure the installed python package is tensorflow-gpu and not just tensorflow.
You can list the packages containing "tensorflow" with conda list tensorflow (or just pip list, if you do not use anaconda), but make sure you have the right environment activated.
When you run your scripts with GPU support, they will start like this:
Using TensorFlow backend.
2018- ... C:\tf_jenkins\...\gpu\gpu_device.cc:1105] Found device 0 with properties:
name: GeForce GTX 1080 major: 6 minor: 1 memoryClockRate(GHz): 1.7845
To test it, just type into the console:
import tensorflow as tf
tf.Session()
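A slightly more explicit variant of that test (a sketch, assuming TF 1.x as in the answer above) lists every device the runtime can see and whether the binary was built with CUDA:

import tensorflow as tf
from tensorflow.python.client import device_lib

sess = tf.Session()  # prints the detected GPU(s) to stderr on creation
print(device_lib.list_local_devices())   # CPU and GPU devices visible to TF
print("Built with CUDA:", tf.test.is_built_with_cuda())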
To check whether the CuDNN DLL is visible from your Python environment, and thereby validate that the PATH variable is set correctly, you can try this:
import ctypes
ctypes.WinDLL("cudnn64_7.dll") # use the file name of your cudnn version here.
You might also want to look into the GPU-optimized Keras layers.
CuDNNLSTM
CuDNNGRU
They are significantly faster (see the sketch below):
https://keras.io/layers/recurrent/#cudnnlstm
We saw a 10x improvement going from the LSTM to CuDNNLSTM Keras layers.
Note:
We also saw a 10x increase in VMS (virtual memory) usage on the machine. So there are tradeoffs to consider.
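A hedged sketch of that swap (older standalone Keras API, as in the linked docs; CuDNNLSTM requires a GPU build of the backend and fixes the activation to tanh):

from keras.models import Sequential
from keras.layers import LSTM, CuDNNLSTM, Dense

use_gpu_layer = True  # requires a GPU with CuDNN available
rnn = CuDNNLSTM if use_gpu_layer else LSTM  # drop-in replacement with default activations

model = Sequential([
    rnn(64, input_shape=(100, 32)),  # 100 timesteps, 32 features (arbitrary toy sizes)
    Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.summary()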

How to integrate tensorflow into QNX operating system

I want to use TensorFlow on the QNX operating system. The very first step is to integrate TensorFlow into QNX. Any suggestions?
There is an issue about this on GitHub; unfortunately it is without a result, but it's a starting point: https://github.com/tensorflow/tensorflow/issues/14753
Depending on your objective, NVIDIA's TensorRT can load TensorFlow models and provides binaries for QNX, see for example https://docs.nvidia.com/deeplearning/sdk/pdf/TensorRT-Release-Notes.pdf

How to use CUDA pinned memory from the TensorFlow Python library

I am currently looking into running TensorFlow on the GPU, and I was reading about CUDA pinned memory.
I was not able to find any way to set this when using the TensorFlow Python library.
Any idea how it can be done?
Pinned memory is used automatically for any memcpy between CPU and GPU. If you want more sophisticated functionality, you can write a kernel that uses it explicitly.
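A minimal TF 2.x sketch of the first point (it only shows where a host-to-device copy happens; the pinned staging buffers themselves are managed internally and are not exposed in the Python API):

import tensorflow as tf

tf.debugging.set_log_device_placement(True)

# Placing the source tensor on the CPU and the consuming op on the GPU
# forces a host-to-device copy; TensorFlow stages such copies through its
# own page-locked (pinned) host buffers under the hood.
with tf.device("/CPU:0"):
    host_tensor = tf.random.normal([4096, 4096])

with tf.device("/GPU:0"):
    result = tf.reduce_sum(host_tensor)  # input copied host -> device here

print(result.numpy())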