Tensorflow 2.2 taking a long time to start


I am trying to run TensorFlow on Windows 10 with the following setup:
Anaconda3 with
python 3.8
tensorflow 2.2.0
GPU: RTX3090
cuda_10.1.243
cudnn-v7.6.5.32 for windows10-x64
Running the following code takes between 5 and 10 minutes to print the output.
import tensorflow as tf
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
I get the following output immediately, but then it hangs for a few minutes before proceeding.
2020-11-17 04:03:00.039069: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-11-17 04:03:00.042677: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-11-17 04:03:00.045041: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-11-17 04:03:00.045775: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-11-17 04:03:00.049246: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-11-17 04:03:00.050633: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-11-17 04:03:00.056731: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-11-17 04:03:00.056821: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
Running the same code on Colab takes only a second.
Any suggestions?
Thanks

I don't understand why Mux's answer was downvoted, because he is right. NVIDIA Ampere GPUs can't run optimally on CUDA versions below 11.1, because the Ampere streaming multiprocessor used here (SM_86) is only supported from CUDA 11.1 onward; see https://forums.developer.nvidia.com/t/can-rtx-3080-support-cuda-10-1/155849/2
However, you may be able to work around the issue without updating CUDA by increasing the default JIT cache size: set the environment variable CUDA_CACHE_MAXSIZE to 2147483648 (2 GiB), for example with 'export CUDA_CACHE_MAXSIZE=2147483648' in a Unix shell or 'set CUDA_CACHE_MAXSIZE=2147483648' in a Windows command prompt. You will still have the long wait on the first startup, though; see https://www.tensorflow.org/install/gpu#hardware_requirements
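As a rough sketch of the same workaround from inside Python: the variable can be set in the process environment, assuming this happens before TensorFlow is imported so that the CUDA driver sees it when it initializes. The helper name `configure_jit_cache` is made up for illustration; only the `CUDA_CACHE_MAXSIZE` variable itself comes from the suggestion above.

```python
import os

# 2147483648 bytes is 2 GiB (2 * 1024**3).
TWO_GIB = 2 * 1024 ** 3


def configure_jit_cache(max_bytes: int = TWO_GIB) -> str:
    """Hypothetical helper: set the CUDA JIT cache size limit for this process."""
    os.environ["CUDA_CACHE_MAXSIZE"] = str(max_bytes)
    return os.environ["CUDA_CACHE_MAXSIZE"]


configure_jit_cache()

# Import TensorFlow only after the variable is set, for example:
# import tensorflow as tf
```

Setting the variable system-wide (in the shell profile or the Windows environment settings) has the same effect and also covers scripts that you don't control.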

The RTX 3090 uses the Ampere architecture, which requires CUDA 11+.
Check out this guide:
https://medium.com/#dun.chwong/the-simple-guide-deep-learning-with-rtx-3090-cuda-cudnn-tensorflow-keras-pytorch-e88a2a8249bc

The reason is as Mux says.
Background:
See https://developer.nvidia.com/blog/cuda-pro-tip-understand-fat-binaries-jit-caching/ for full explanation.
The first stage compiles source device code to PTX virtual assembly, and the second stage compiles the PTX to binary code for the target architecture. The CUDA driver can execute the second stage compilation at run time, compiling the PTX virtual assembly “Just In Time” to run it.
So when an old software package runs on new hardware, that is, when binary code for the target architecture was not precompiled, it falls back to the PTX virtual assembly and triggers a runtime JIT compile for the new architecture. That means the cuDNN and cuBLAS kernels and TensorFlow's built-in kernels are all JIT-compiled at startup, which causes the long startup time in your case.
That is why Dan Pavlov suggests increasing the JIT cache size: with a cache that is large enough, you JIT-compile only once instead of again on every startup.
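To make the fallback rule concrete, here is a small sketch that classifies a device against a list of architectures a build was compiled for. It assumes the naming used by tf.sysconfig.get_build_info(), where an `sm_XY` entry carries binary code only and a `compute_XY` entry carries both binary code and PTX; that interpretation, the function names, and the sample list (taken from a TensorFlow build shown in a related question, not necessarily the 2.2 build discussed here) are all assumptions for illustration.

```python
from typing import Iterable, Tuple


def parse_arch(entry: str) -> Tuple[str, int]:
    """Split 'sm_61' or 'compute_70' into ('sm', 61) or ('compute', 70)."""
    kind, _, number = entry.partition("_")
    return kind, int(number)


def startup_mode(build_archs: Iterable[str], device_cc: Tuple[int, int]) -> str:
    """Return 'native', 'jit' or 'unsupported' for a device compute capability."""
    major, minor = device_cc
    device = major * 10 + minor
    archs = [parse_arch(a) for a in build_archs]

    # Binary code runs on a device with the same major version and an
    # equal or higher capability.
    if any(cc // 10 == major and cc <= device for _, cc in archs):
        return "native"
    # PTX can be JIT-compiled by the driver for any newer device.
    if any(kind == "compute" and cc <= device for kind, cc in archs):
        return "jit"
    return "unsupported"


sample = ["sm_35", "sm_37", "sm_52", "sm_60", "sm_61", "compute_70"]
print(startup_mode(sample, (8, 6)))  # compute capability 8.6, as on the RTX 30 series
```

For a build like this sample, an 8.6 device lands in the "jit" branch, which is the slow-startup case described above.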

Related

Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found

I recently installed Tensorflow-gpu and 2 errors popped up.
Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
Ignore above cudart dlerror if you do not have a GPU set up on your machine.
The thing is, I want to use my GPU, so I am not sure whether this will affect it.
I installed cuDNN, yet still nothing. I installed VS code 2017 and installed C++ with it, yet I still get this error. I am using PyCharm as my interpreter.

Unable to get tensorflow to recognize my GPU on Windows 10

I had previously installed the tensorflow-gpu 2.5.0 package using conda on my machine in 2 environments. In a Python 3.7 environment, everything worked well. In a Python 3.8 environment, I was having issues, but I found that if I loaded certain libraries first, I was able to get TensorFlow to recognize my GPU.
Unfortunately, I ended up with some issues in those environments, and I had to remove them.
Now, no matter how I try, I cannot get tensorflow to load cudart64_110.dll and, therefore, recognize my GPU. I have tried installing using the tensorflow-gpu 2.5.0 package using conda, and I have tried following the instructions on the official tensorflow.org page, but in neither case will tensorflow load the cudart64_110.dll library.
However, I also have spacy installed in my environment, and if I import spacy, I get the following messages:
2022-09-16 13:42:58.588234: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2022-09-16 13:42:58.588417: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2022-09-16 13:43:00.551253: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library nvcuda.dll
2022-09-16 13:43:00.555761: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3060 Laptop GPU computeCapability: 8.6
coreClock: 1.425GHz coreCount: 30 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 312.97GiB/s
2022-09-16 13:43:00.556393: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2022-09-16 13:43:00.556491: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cublas64_11.dll
2022-09-16 13:43:00.556787: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cublasLt64_11.dll
2022-09-16 13:43:00.557350: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cufft64_10.dll
2022-09-16 13:43:00.557720: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'curand64_10.dll'; dlerror: curand64_10.dll not found
2022-09-16 13:43:00.558789: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cusolver64_11.dll
2022-09-16 13:43:00.559043: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cusparse64_11.dll
2022-09-16 13:43:00.559383: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudnn64_8.dll
2022-09-16 13:43:00.559746: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1766] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
So, clearly, some of the CUDA libraries are being opened correctly, but not all.
I've tried uninstalling miniconda & reinstalling it, but I get the same issues.
I'm at a loss as to why tensorflow continues to give me these problems.

How do I fix package dependencies when using a different cudatoolkit than my nvidia cluster is using?

I am using a package that requires tensorflow-gpu==2.0.0 and CUDA==10.0.0 with cuDNN==7.6.0.
I am running this code on an NVIDIA GPU cluster, and nvidia-smi still shows CUDA 11, which I guess is the version installed on the actual server.
I was told that I can basically 'override' this version by installing cudatoolkit in the version that I need. I did that and installed cudatoolkit==10.0.
Unfortunately I am now running into a problem when trying to run an LSTM based model with tensorflow-gpu. I get the following:
2022-06-14 17:02:26.988359: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.2/lib64
2022-06-14 17:02:26.989175: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.2/lib64
2022-06-14 17:02:26.989208: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
In the path I still see cuda11. Is this causing the problem? How can I resolve this?

Since you mentioned in the comment that you need to use TensorFlow 2.1, you need to install cuDNN 7.6 and CUDA 10.1 specifically.
Please refer to the tested build configurations to see which CUDA and cuDNN versions are compatible with each Python and TensorFlow version.
Please check this link for more details on GPU setup.

Using TensorFlow with GPU taking a long time for loading library related to CUDA

Machine Setting:
GPU: GeForce RTX 3060
Driver Version: 460.73.01
CUDA Driver Version: 11.2
Tensorflow: tensorflow-gpu 1.14.0
CUDA Runtime Version: 10.0
cudnn: 7.4.1
Note:
CUDA Runtime and cudnn version fits the guide from Tensorflow official documentation.
I've also tried tensorflow-gpu 2.0, with the same problem.
Problem:
I am using TensorFlow for an object detection task. The program gets stuck at
2021-06-05 12:16:54.099778: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10
for several minutes.
It then gets stuck at the next loading step
2021-06-05 12:21:22.212818: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
for an even longer time. You may check log.txt for log details.
After waiting for around 30 minutes, the program starts running and works well.
However, whenever the program invokes self.session.run(...), it loads the same two CUDA libraries (libcublas and libcudnn) again, which wastes time and is annoying.
I don't understand where the problem comes from or how to resolve it. Could anyone help?
Discussion Issue on Github
===================================
Update
With @talonmies's help, the problem was resolved by rebuilding the environment with matching versions of the GPU driver, CUDA, cuDNN and TensorFlow. Now it works smoothly.

Generally, you can observe this behavior whenever the TF, CUDA and cuDNN versions are incompatible.
For the GeForce RTX 3060, support starts from CUDA 11.x. Once you upgrade to TF 2.4 or TF 2.5, your issue will be resolved.
For the benefit of the community, here is the tested build configuration:
CUDA Support Matrix
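As a quick illustration of how such a check could look in code, the sketch below keeps a small lookup of CUDA and cuDNN versions per TensorFlow release. The table is only an excerpt that I believe matches TensorFlow's tested build configurations for these releases; treat the values as assumptions and verify them against the official page. The function names are hypothetical.

```python
from typing import Optional, Tuple

# (CUDA, cuDNN) per TensorFlow minor release.
# Excerpt for illustration; verify against the official tested build table.
TESTED_CONFIGS = {
    "1.14": ("10.0", "7.4"),
    "2.0": ("10.0", "7.4"),
    "2.1": ("10.1", "7.6"),
    "2.2": ("10.1", "7.6"),
    "2.3": ("10.1", "7.6"),
    "2.4": ("11.0", "8.0"),
    "2.5": ("11.2", "8.1"),
}


def expected_toolkit(tf_version: str) -> Optional[Tuple[str, str]]:
    """Look up (CUDA, cuDNN) for a version string such as '2.2.0'."""
    key = ".".join(tf_version.split(".")[:2])
    return TESTED_CONFIGS.get(key)


def built_for_cuda_11(tf_version: str) -> Optional[bool]:
    """True if the release targets CUDA 11 or newer, None if unknown."""
    config = expected_toolkit(tf_version)
    if config is None:
        return None
    return int(config[0].split(".")[0]) >= 11


for version in ("1.14.0", "2.0.0", "2.4.0"):
    print(version, expected_toolkit(version), built_for_cuda_11(version))
```

With an RTX 30-series card, the releases for which `built_for_cuda_11` returns False are the ones likely to show the slow library loading described in the question.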

GPU problem for CUDA 11.0 and cuDNN 8.0.2

I have CUDA 11.0 and cuDNN 8.0.2, which is the recommended setup.
I have tensorflow-gpu 2.3 and keras 2.4.
However, the GPUs are not used and I don't know why.
I ran the following commands:
import tensorflow as tf
sess = tf.test.is_gpu_available(cuda_only=False, min_cuda_compute_capability=None)
print("GPU available? ", sess)
built = tf.test.is_built_with_cuda()
print("tf is built with CUDA? ", built)
gpus = tf.config.list_physical_devices('GPU')
cpus = tf.config.list_physical_devices('CPU')
print("Num GPUs used: ", len(gpus))
print("Num CPUs used: ", len(cpus))
print(tf.sysconfig.get_build_info())
The output is the following:
GPU available? False
tf is built with CUDA? True
Num GPUs used: 0
Num CPUs used: 1
{'cuda_version': '10.1', 'cudnn_version': '7', 'cuda_compute_capabilities': ['sm_35', 'sm_37', 'sm_52', 'sm_60', 'sm_61', 'compute_70'], 'cpu_compiler': '/usr/bin/gcc-5', 'is_rocm_build': False, 'is_cuda_build': True}
It also produces the following messages:
W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
W tensorflow/core/common_runtime/gpu/gpu_device.cc:1753] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.

As stated in the TensorFlow documentation, the software requirements are as follows:
NVIDIA GPU drivers - 418.x or higher
CUDA - 10.1 (TensorFlow >= 2.1.0)
cuDNN - 7.6
See Link
You must have a Python version between 3.5 and 3.8.
Along with that, you need the Microsoft Visual C++ Redistributable for Visual Studio 2015, 2017 and 2019.
You can download that here. Link
Check the full system requirements here Link
Don't forget to add CUDA and cuDNN to your system path. See Link
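To check the path step, a small script can list where (if anywhere) a given library file is found on the search path. This is only a sketch using the standard library: the function name `find_on_path` is made up, and the file names in the usage part are the Windows DLL names that appear in the logs on this page; adjust them for your platform and versions.

```python
import os
from typing import List, Optional


def find_on_path(filename: str, path_value: Optional[str] = None) -> List[str]:
    """Return every location of `filename` in the directories listed in PATH."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits = []
    for directory in path_value.split(os.pathsep):
        candidate = os.path.join(directory, filename)
        # Skip empty PATH entries and anything that is not a regular file.
        if directory and os.path.isfile(candidate):
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    for dll in ("cudart64_101.dll", "cudnn64_7.dll"):
        print(dll, "->", find_on_path(dll) or "not found on PATH")
```

If a library shows up in more than one directory, the first hit is usually the one that gets loaded, so stale installs earlier in the path are worth removing.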

This solution cleared the issue completely in my case, even though my system is set up with CUDA 10.2. I guess TensorFlow requires something from CUDA 10.1.
conda install cudatoolkit=10.1
Check https://github.com/tensorflow/tensorflow/issues/38578#issuecomment-710104168
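The mismatch behind this fix can also be spotted by comparing the versions TensorFlow reports it was built with against what is installed. Below is a minimal sketch that takes a dictionary shaped like the tf.sysconfig.get_build_info() output printed in the question; the function names are hypothetical, and the installed versions are passed in by hand rather than detected.

```python
from typing import Dict, List


def _matches(built: str, installed: str) -> bool:
    """True if the built version is a component-wise prefix of the installed one."""
    built_parts = built.split(".")
    return installed.split(".")[: len(built_parts)] == built_parts


def check_build_against_install(
    build_info: Dict[str, object], installed_cuda: str, installed_cudnn: str
) -> List[str]:
    """Return human-readable mismatches between build info and installed versions."""
    problems = []
    built_cuda = str(build_info.get("cuda_version", ""))
    built_cudnn = str(build_info.get("cudnn_version", ""))
    if built_cuda and not _matches(built_cuda, installed_cuda):
        problems.append(f"built for CUDA {built_cuda}, found {installed_cuda}")
    if built_cudnn and not _matches(built_cudnn, installed_cudnn):
        problems.append(f"built for cuDNN {built_cudnn}, found {installed_cudnn}")
    return problems


# Values copied from the question: a build for CUDA 10.1 and cuDNN 7,
# on a system with CUDA 11.0 and cuDNN 8.0.2.
build_info = {"cuda_version": "10.1", "cudnn_version": "7"}
for line in check_build_against_install(build_info, "11.0", "8.0.2"):
    print(line)
```

In a live environment, `build_info` would come from tf.sysconfig.get_build_info(). A non-empty result suggests either installing the matching toolkit (as with `conda install cudatoolkit=10.1` above) or switching to a TensorFlow release built for the installed CUDA.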
