tensorflow-gpu missing libraries dlerror - Import error - tensorflow
I am trying to use tensorflow-gpu for the first time on an HPC cluster. TensorFlow reports several missing libraries, which prevents it from using the GPU:
2020-11-22 14:19:26.629817: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/R/3.5.1/lib64/R/lib:/opt/cluster/lib:/opt/cluster/external/p7zip-16.02/lib/p7zip
2020-11-22 14:19:26.629870: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2020-11-22 14:19:30.479705: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2020-11-22 14:19:31.048853: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:82:00.0 name: Tesla P100-PCIE-16GB computeCapability: 6.0
coreClock: 1.3285GHz coreCount: 56 deviceMemorySize: 15.90GiB deviceMemoryBandwidth: 681.88GiB/s
2020-11-22 14:19:31.049038: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/R/3.5.1/lib64/R/lib:/opt/cluster/lib:/opt/cluster/external/p7zip-16.02/lib/p7zip
2020-11-22 14:19:31.049540: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcublas.so.10'; dlerror: libcublas.so.10: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/R/3.5.1/lib64/R/lib:/opt/cluster/lib:/opt/cluster/external/p7zip-16.02/lib/p7zip
2020-11-22 14:19:31.049988: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcufft.so.10'; dlerror: libcufft.so.10: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/R/3.5.1/lib64/R/lib:/opt/cluster/lib:/opt/cluster/external/p7zip-16.02/lib/p7zip
2020-11-22 14:19:31.050412: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcurand.so.10'; dlerror: libcurand.so.10: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/R/3.5.1/lib64/R/lib:/opt/cluster/lib:/opt/cluster/external/p7zip-16.02/lib/p7zip
2020-11-22 14:19:31.050833: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcusolver.so.10'; dlerror: libcusolver.so.10: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/R/3.5.1/lib64/R/lib:/opt/cluster/lib:/opt/cluster/external/p7zip-16.02/lib/p7zip
2020-11-22 14:19:31.051262: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcusparse.so.10'; dlerror: libcusparse.so.10: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/R/3.5.1/lib64/R/lib:/opt/cluster/lib:/opt/cluster/external/p7zip-16.02/lib/p7zip
2020-11-22 14:19:31.539912: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2020-11-22 14:19:31.539974: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1753] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
Num GPUs Available: 0
By using "nvcc -version" I have:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
Installed versions: cudatoolkit 9.0, cuDNN 7.6.5, TensorFlow 2.3.1.
I searched online and found similar errors, but the suggested solutions did not work in my case. Can you please help me?
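As suggested by @Robert Crovella and as stated on the TensorFlow website, TensorFlow 2.3.1 requires CUDA 10.1. Your nvcc reports CUDA 10.0 and your cudatoolkit package is 9.0, so neither provides the 10.1 libraries.
The error says the same thing: Could not load dynamic library 'libcudart.so.10.1'.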
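One quick way to see which of the libraries named in the log the dynamic loader can actually find is to try opening each one from Python. This is only a sketch: `can_load`, `report` and `REQUIRED_LIBS` are hypothetical names, the list is copied from the warnings in the question, and the check covers loadability only, not whether the versions match what TensorFlow was built against.

```python
import ctypes

# Library names copied from the "Could not load dynamic library" warnings.
REQUIRED_LIBS = [
    "libcudart.so.10.1",
    "libcublas.so.10",
    "libcufft.so.10",
    "libcurand.so.10",
    "libcusolver.so.10",
    "libcusparse.so.10",
]


def can_load(name):
    """Return True if the dynamic loader can open the shared library `name`."""
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        # Raised when the file is not on the loader's search path
        # (for example, not covered by LD_LIBRARY_PATH).
        return False


def report(names=REQUIRED_LIBS):
    """Map each library name to whether it could be loaded."""
    return {name: can_load(name) for name in names}


if __name__ == "__main__":
    for name, ok in report().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

On a cluster, running this before and after loading a CUDA 10.1 environment module (if your site provides one) shows whether LD_LIBRARY_PATH now covers the missing files.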
Related
Time consuming Tensorflow's CUDA driver check in AWS Lambda
I've been running an AWS Lambda with a mounted EFS, on which I've installed Tensorflow 2.4. When I run the Lambda (and every Lambda that uses Tensorflow 2.4), it spends a long time (about 4 minutes, sometimes more) on what looks like Tensorflow's settings checks, so I have to set a very generous timeout to work around it. This is what the Lambda prints:
2022-05-17 06:33:21.917336: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2022-05-17 06:33:21.921992: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /var/lang/lib:/lib64:/usr/lib64:/var/runtime:/var/runtime/lib:/var/task:/var/task/lib:/opt/lib
2022-05-17 06:33:21.922025: W tensorflow/stream_executor/cuda/cuda_driver.cc:326] failed call to cuInit: UNKNOWN ERROR (303)
2022-05-17 06:33:21.922048: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (169.254.137.137): /proc/driver/nvidia/version does not exist
2022-05-17 06:33:21.922460: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2022-05-17 06:33:22.339905: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
2022-05-17 06:33:22.340468: I tensorflow/core/platform/profile_utils/cpu_utils.cc:112] CPU Frequency: 2500010000 Hz
[WARNING] 2022-05-17T06:33:22.436Z c4500036-5b77-4808-a062-f8ae820b0317 AutoGraph could not transform <function Model.make_predict_function.<locals>.predict_function at 0x7f65bfb37280> and will run it as-is. Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, export AUTOGRAPH_VERBOSITY=10) and attach the full output.
Cause: unsupported operand type(s) for -: 'NoneType' and 'int'
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
What I need is to eliminate this wasted time and get a clean run.
tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
I was installing TensorFlow on my CPU when I got these two messages:
2022-03-13 17:59:56.171741: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll not found
2022-03-13 17:59:56.171872: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Can anybody help me out here? I was also following a tutorial from a few years ago.
These are just a Warning (W) and an Information (I) message saying that the CUDA libraries cannot be found. The I message on the second line tells you to ignore the W message above it if no CUDA GPU is set up on your machine. The only effect is that training will run on the CPU only. If you are using an NVIDIA GPU, you can refer to how to install the missing files. If you don't use an NVIDIA GPU, or simply want to hide the I and W messages, add the two lines below at the beginning of your code, before importing TensorFlow:
    import os
    os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
You can read more about TF_CPP_MIN_LOG_LEVEL at TensorFlow logging.
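The same setting can be wrapped in a small helper. This is a sketch only: `set_tf_log_level` is a hypothetical name, not a TensorFlow function. The variable has to be in the environment before the first `import tensorflow`, which is why the import is shown last and commented out.

```python
import os


def set_tf_log_level(level="2"):
    """Set TF_CPP_MIN_LOG_LEVEL; call this before importing TensorFlow."""
    if level not in {"0", "1", "2", "3"}:
        raise ValueError("level must be one of '0', '1', '2', '3'")
    os.environ["TF_CPP_MIN_LOG_LEVEL"] = level
    return level


# '0' shows everything, '1' hides INFO, '2' hides INFO and WARNING,
# '3' hides INFO, WARNING and ERROR.
set_tf_log_level("2")

# import tensorflow as tf   # import only after the variable is set
```

Hiding the messages does not change which device is used; without the CUDA libraries TensorFlow still runs on the CPU.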
Process finished with exit code -1073740791 (0xC0000409) tensorflow-gpu
Here is the code I am using:
    import math
    import pandas_datareader as web
    import numpy as np
    import pandas as pd
    from sklearn.preprocessing import MinMaxScaler
    from keras.models import Sequential
    from keras.layers import Dense, LSTM
    import matplotlib.pyplot as plt

    Stock = 'BTC-USD'

    #Get the stock quote
    df = web.DataReader(Stock, data_source='yahoo', start='2016-01-01', end='2020-12-17')
    #Show the Data
    #print(df)

    #Get the number of rows and columns in the data set
    #print(df.shape)

    #Visualize the closing price history
    plt.figure(figsize=(16,8))
    plt.title("Close Price History")
    plt.plot(df['Close'])
    plt.xlabel('Date', fontsize=18)
    plt.ylabel('Close Price USD ($)', fontsize=18)
    plt.show()
    print(1)

    #Create a new Dataframe with only the 'close column'
    data = df.filter(['Close'])
    #Convert the dataframe to a numpy array
    dataset = data.values
    #Get the number of rows to train the model on
    training_data_len = math.ceil(len(dataset) * .8)
    print(2)
    #print(training_data_len)

    #Scale the data
    scaler = MinMaxScaler(feature_range=(0,1))
    scaled_data = scaler.fit_transform(dataset)
    print(3)
    #print(scaled_data)

    #Create the training data set
    #Create the scaled training data set
    train_data = scaled_data[0:training_data_len, :]
    #Split the data into x_train and y_train data sets
    x_train = []
    y_train = []
    print(4)
    for i in range(60, len(train_data)):
        x_train.append(train_data[i-60:i, 0])
        y_train.append(train_data[i, 0])
        if i <= 61:
            print(x_train)
            print(y_train)
            print()
    print(5)

    #Convert the x_train and y_train to numpy arrays
    x_train, y_train = np.array(x_train), np.array(y_train)
    print(6)

    #Reshape the x_train data set
    x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))
    #print(x_train.shape)
    print(7)

    #Build the LSTM model
    model = Sequential()
    model.add(LSTM(50, return_sequences=True, input_shape=(x_train.shape[1], 1)))
    model.add(LSTM(50, return_sequences=False))
    model.add(Dense(25))
    model.add(Dense(1))
    print(8)

    #Compile the model
    model.compile(optimizer='adam', loss='mean_squared_error')
    print(9)

    #Train the model
    model.fit(x_train, y_train, batch_size=1, epochs=1)
    print(10)

    #Create the test data set
    #Create a new array containing scaled values from index 1543 to 2003
    test_data = scaled_data[training_data_len - 60: 2003]
    #Create the data sets x_test and y_test
    x_test = []
    y_test = dataset[training_data_len:, :]
    for i in range(60, len(test_data)):
        x_test.append(test_data[i-60:i, 0])
    print(11)

    #Convert the data to a numpy array
    x_test = np.array(x_test)
    #print(x_test.shape)

    #Reshape the data
    x_test = np.reshape(x_test, (x_test.shape[0], x_test.shape[1], 1))

    #Get the models predicted price values
    predictions = model.predict(x_test)
    predictions = scaler.inverse_transform(predictions)

    #Evaluate the model. Get the root mean squared error (RMSE)
    #(square the errors before averaging, then take the root)
    rmse = np.sqrt(np.mean((predictions - y_test)**2))
    print(rmse)

    #Plot the data
    train = data[:training_data_len]
    valid = data[training_data_len: ]
    valid['Predictions'] = predictions

    #Visualize the data
    plt.figure(figsize=(16,8))
    plt.title('Model')
    plt.xlabel('Date', fontsize=18)
    plt.ylabel('Close Price USD ($)', fontsize=18)
    plt.plot(train['Close'])
    plt.plot(valid[['Close', 'Predictions']])
    plt.legend(['Train', 'Val', 'Predictions'], loc='lower right')
    plt.show()

    #Show the valid and predicted prices
    print(valid)
It stops between 9 and 10 if you look at my "print" debugging method, and this is the output:
C:\Users\gunne\anaconda3\envs\EnvBioWell\python.exe C:/Users/gunne/PycharmProjects/pythonProject/Stocks_Price.py
2021-02-19 09:57:52.322041: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
1
2
3
4
[array([0.00307386, 0.00303452, 0.00288404, 0.00301928, 0.00296962, 0.00284426, 0.00411515, 0.00390359, 0.00365686, 0.00367355, 0.00369274, 0.00313341, 0.00298767, 0.00289699, 0.
, 0.00101894, 0.00078898, 0.00100278, 0.00069457, 0.00245455, 0.00201685, 0.00079746, 0.00101697, 0.0016967 , 0.00120293, 0.00122168, 0.00134546, 0.00070072, 0.00066494, 0.00061141, 0.00019479, 0.00038312, 0.00044424, 0.00024669, 0.00110931, 0.0009756 , 0.00053531, 0.00053962, 0.00040029, 0.00051366, 0.00076044, 0.00067284, 0.00087522, 0.00120881, 0.00188371, 0.00157436, 0.00189504, 0.00228295, 0.00254865, 0.00247892, 0.00319813, 0.00326988, 0.00322377, 0.00247677, 0.00266203, 0.00264398, 0.00297805, 0.00299417, 0.00303742, 0.00322153])] [0.003108507179665692] [array([0.00307386, 0.00303452, 0.00288404, 0.00301928, 0.00296962, 0.00284426, 0.00411515, 0.00390359, 0.00365686, 0.00367355, 0.00369274, 0.00313341, 0.00298767, 0.00289699, 0. , 0.00101894, 0.00078898, 0.00100278, 0.00069457, 0.00245455, 0.00201685, 0.00079746, 0.00101697, 0.0016967 , 0.00120293, 0.00122168, 0.00134546, 0.00070072, 0.00066494, 0.00061141, 0.00019479, 0.00038312, 0.00044424, 0.00024669, 0.00110931, 0.0009756 , 0.00053531, 0.00053962, 0.00040029, 0.00051366, 0.00076044, 0.00067284, 0.00087522, 0.00120881, 0.00188371, 0.00157436, 0.00189504, 0.00228295, 0.00254865, 0.00247892, 0.00319813, 0.00326988, 0.00322377, 0.00247677, 0.00266203, 0.00264398, 0.00297805, 0.00299417, 0.00303742, 0.00322153]), array([0.00303452, 0.00288404, 0.00301928, 0.00296962, 0.00284426, 0.00411515, 0.00390359, 0.00365686, 0.00367355, 0.00369274, 0.00313341, 0.00298767, 0.00289699, 0. 
, 0.00101894, 0.00078898, 0.00100278, 0.00069457, 0.00245455, 0.00201685, 0.00079746, 0.00101697, 0.0016967 , 0.00120293, 0.00122168, 0.00134546, 0.00070072, 0.00066494, 0.00061141, 0.00019479, 0.00038312, 0.00044424, 0.00024669, 0.00110931, 0.0009756 , 0.00053531, 0.00053962, 0.00040029, 0.00051366, 0.00076044, 0.00067284, 0.00087522, 0.00120881, 0.00188371, 0.00157436, 0.00189504, 0.00228295, 0.00254865, 0.00247892, 0.00319813, 0.00326988, 0.00322377, 0.00247677, 0.00266203, 0.00264398, 0.00297805, 0.00299417, 0.00303742, 0.00322153, 0.00310851])] [0.003108507179665692, 0.0026196096172032522] 5 6 7 2021-02-19 09:57:55.479350: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set 2021-02-19 09:57:55.479830: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library nvcuda.dll 2021-02-19 09:57:55.506019: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: pciBusID: 0000:0e:00.0 name: GeForce RTX 3080 computeCapability: 8.6 coreClock: 1.74GHz coreCount: 68 deviceMemorySize: 10.00GiB deviceMemoryBandwidth: 707.88GiB/s 2021-02-19 09:57:55.506184: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll 2021-02-19 09:57:55.509645: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll 2021-02-19 09:57:55.509728: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll 2021-02-19 09:57:55.511906: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll 2021-02-19 09:57:55.512542: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll 2021-02-19 09:57:55.516144: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened 
dynamic library cusolver64_10.dll 2021-02-19 09:57:55.517468: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll 2021-02-19 09:57:55.518014: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll 2021-02-19 09:57:55.518133: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0 2021-02-19 09:57:55.518454: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags. 2021-02-19 09:57:55.519487: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: pciBusID: 0000:0e:00.0 name: GeForce RTX 3080 computeCapability: 8.6 coreClock: 1.74GHz coreCount: 68 deviceMemorySize: 10.00GiB deviceMemoryBandwidth: 707.88GiB/s 2021-02-19 09:57:55.519678: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll 2021-02-19 09:57:55.519763: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll 2021-02-19 09:57:55.519836: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll 2021-02-19 09:57:55.519915: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll 2021-02-19 09:57:55.519995: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll 2021-02-19 09:57:55.520066: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusolver64_10.dll 2021-02-19 09:57:55.520138: I 
tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll 2021-02-19 09:57:55.520212: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll 2021-02-19 09:57:55.520304: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0 2021-02-19 09:57:56.033929: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix: 2021-02-19 09:57:56.034021: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267] 0 2021-02-19 09:57:56.034068: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1280] 0: N 2021-02-19 09:57:56.034278: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1406] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 8444 MB memory) -> physical GPU (device: 0, name: GeForce RTX 3080, pci bus id: 0000:0e:00.0, compute capability: 8.6) 2021-02-19 09:57:56.035130: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set 8 9 2021-02-19 09:57:56.610845: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2) 2021-02-19 09:57:57.789418: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll 2021-02-19 09:57:58.358210: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll 2021-02-19 09:57:58.377765: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll Process finished with exit code -1073740791 (0xC0000409) I'm using my RTX 3080 GPU with Cuddn 8.0 and Cuda Toolkit 11.0 and tensorflow-gpu 2.4.0 with python 3.7. Please let me know if anyone has any suggestions! 
I've tried everything I can think of. When I run code to check for the GPU it is detected, so it should be working, but I just can't get the script to run. I had it working a month ago before I messed something up, and now the output shows all of those timestamped lines about the libraries it is opening. I'm not sure why; I just need help.
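A side note on the exit code in the title: PyCharm prints the process exit code as a signed 32-bit integer, and the value in parentheses is the same number read as unsigned hex. The sketch below shows the conversion (the helper name `to_ntstatus` is made up for illustration). As far as I know, 0xC0000409 is the Windows status STATUS_STACK_BUFFER_OVERRUN, meaning the process was terminated from native code rather than by a Python exception, which would explain why there is no traceback.

```python
def to_ntstatus(exit_code):
    """Reinterpret a signed 32-bit process exit code as unsigned hex."""
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


print(to_ntstatus(-1073740791))  # prints 0xC0000409
```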
Have you tried this link: https://www.reddit.com/r/tensorflow/comments/jsalkw/rtx_3090_and_tensorflow_for_windows_10_step_by/ Or have you solved it already? Here is the content of the link:

The NVIDIA 3000 Series GPUs (Ampere) require CUDA v11 and cuDNN v8 to work. The tensorflow versions on anaconda and pip on Windows (currently at max tensorflow 2.3) do not include a tensorflow built with CUDA v11. But you can use pip to install a nightly build of tensorflow (currently tensorflow 2.5) which is built with CUDA v11.

Apart from a tensorflow build with CUDA v11, you will also need the actual DLLs for CUDA v11 and cuDNN v8. Normally, you would just install these with anaconda with the packages cudatoolkit and cudnn, but while cudatoolkit is available with v11, cudnn v8 is not available in anaconda, at least for Windows. The workaround is to get these DLLs manually and add them to the system environment path (so that python/tensorflow can find and load them).

So let's start:

First, install anaconda if you haven't already. Open the anaconda prompt with admin rights.

Type conda create -n tf2 python=3.8 and hit enter to create a new anaconda environment with python 3.8 (the tensorflow nightly build needs python 3.8 or higher, that's why we are using python 3.8).

Type activate tf2 or conda activate tf2 and hit enter to enter that new environment.

Install the nightly tensorflow build with pip3 install tf-nightly-gpu

Install other packages that you might need. For me, it's conda install jupyter scikit-learn matplotlib pandas

Now, download CUDA v11 from NVIDIA (https://developer.nvidia.com/cuda-downloads or https://developer.nvidia.com/cuda-toolkit-archive ). Yeah, the file is pretty big at 3GB.

Additionally, we apparently also need a Microsoft Visual Studio version for C++ for the installer to run properly. Download the free Visual Studio Community Edition (https://visualstudio.microsoft.com/downloads/ ) and install the C++ components. For this, select "Desktop development with C++", select the first 6 options and install. This step is taken from the guide I mentioned earlier, so refer to it if you have trouble with this. I already had Visual Studio with the C++ components set up on my computer, so I could skip this step.

Now run the CUDA v11 installer. You can do the express installation, but if you already have GeForce Experience installed, you can also choose the Custom option and deselect everything that you already have installed with a higher version. For me, I only needed the very first checkbox with the CUDA options, so that might be enough.

What the CUDA v11 installer basically did was install all the CUDA v11 DLLs, headers, and so on in the directory "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1" (the version may be different for you). What we will do next: add the cuDNN DLLs, headers, etc. to this directory as well, and then add this directory to the system path.

Download cuDNN from NVIDIA (https://developer.nvidia.com/rdp/cudnn-download ). This file is around 700MB. You need to register as a developer and answer some questions, but don't worry, it's free. When asked for an email, you can type in any email, since on the next page you will get an option to log in using google or facebook as an alternative (which you may or may not prefer).

Once you have downloaded the file, extract it. Going into the directory, you will see three folders: "bin", "include", "lib". Comparing it with the CUDA v11 directory (C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1), you'll notice that these directories are present there as well! So just copy the folders from cuDNN to the CUDA v11 directory. Windows will add the files into the existing folders.

Now, let's add those directories to the system path. In Windows, open Start and search for "This PC". Right-click and select "Properties" to open a window called "System". On the left side at the bottom, select "Advanced system settings". Click "Environment Variables..." at the bottom. Here, in the lower half, in "System variables", find and open "Path". Click "New" to add a new directory to the system path. Do this for each of the following directories (as mentioned earlier, the version number may be different for you). Some of the directories may already be listed there, so feel free to skip them (there is no negative effect from double entries though, so don't worry too much):
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\bin
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\libnvvp
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\extras\CUPTI\lib64
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\include

Now, very important: restart your system!

Now, run your code to see if everything works. For me, it was through a jupyter notebook. A simple thing to do first is to import tensorflow and check the physical devices:
    import tensorflow as tf
    tf.config.list_physical_devices()

Your GPU may not show up. Take a close look at the output of the console (for me, it was the anaconda prompt from which I started my jupyter notebook). There, you should see logs like tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll, or a similar log stating that a certain DLL could not be loaded!

In my case, everything loaded except the DLL "cusolver64_10.dll". So I went to the CUDA v11 directory (C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1) and opened the "bin" folder (the DLLs are in there) to check whether that DLL was there. Nope, it was not. Instead, there was "cusolver64_11.dll". So I just copied that DLL and renamed the copy to "cusolver64_10.dll". Yeah, sounds dumb, but after that, everything worked.
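To check whether the DLLs named in the dso_loader log lines are actually reachable through the system path after the restart, something like the following can help. It is a sketch under assumptions: `find_on_path` is a hypothetical helper, the DLL names are the ones quoted in this thread and may differ for your CUDA version, and it only checks that the files exist in a PATH directory, not that they load.

```python
import os


def find_on_path(filenames, path=None):
    """Return {filename: full path or None} by scanning PATH-style directories."""
    if path is None:
        path = os.environ.get("PATH", "")
    dirs = [d for d in path.split(os.pathsep) if d]
    found = {}
    for name in filenames:
        found[name] = None
        for d in dirs:
            candidate = os.path.join(d, name)
            if os.path.isfile(candidate):
                found[name] = candidate
                break  # keep the first match, as the loader would
    return found


if __name__ == "__main__":
    # DLL names taken from the log lines quoted in this thread.
    wanted = ["cudart64_110.dll", "cublas64_11.dll",
              "cusolver64_10.dll", "cudnn64_8.dll"]
    for name, location in find_on_path(wanted).items():
        print(f"{name}: {location or 'not found on PATH'}")
```

A "not found" for cusolver64_10.dll next to a hit for cusolver64_11.dll would match the situation described at the end of the post.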
Object Detection Performance Issues Using Tensorflow 2.1.0 and Tensorflow Hub
While working through some of the object detection documentation and examples found online that use the OpenImagesV4 model, I am seeing poor processing speed for the detection step. The code I am using is below; it is a stripped-down version of the detection so that I can understand the performance metrics. The camera stream works fine without any detection. Once detection is added, the feed falls behind by roughly 20 seconds. I have seen this done in TF1.14 using the old object detection with tf.graph() functions with near-zero delay on a different model, so my question is really where more performance can be gained for the feed stream, or where the hang-ups are in this stripped-down version. This is using the GPU for processing, but I only see utilization spikes at ~6%. My original thought was to introduce threading for the detection process, but I am not sure how to go about that or whether it is necessary.
Software
Tensorflow version (2.1.0)
Cuda 10.1
cudnn 7
Hardware
CPU: Intel i7-4820K
GPU: Geforce GTX 1660 (6GB)
Memory: 16GB
    import cv2
    import time
    import gc
    from datetime import datetime
    import tensorflow as tf
    import tensorflow_hub as hub

    low_res_vid_source = "http://192.168.1.85:14238/videostream.cgi?loginuse=####&loginpas=######"
    hi_res_vid_source = "rtsp://####:#####192.168.1.85:10554/tcp/av0_0"
    cap = cv2.VideoCapture(low_res_vid_source)
    #Low Res (640): Hi Res (1280)
    width = cap.get(3)
    #Low Res (480): Hi Res (720)
    height = cap.get(4)
    print("Dimensions: Width: ", width, "Height: ", height)

    #Remote Loading
    #module_handle = "https://tfhub.dev/google/faster_rcnn/openimages_v4/inception_resnet_v2/1"
    #Local Loading
    module_handle = "C://Users//Isaiah//tf2//Tutorial Sets//Expert//HubCache//ddd04e3eaa283f2b3ae566e084863074d12b403a"
    detector = hub.load(module_handle).signatures['default']

    def LoadStream():
        ret, frame = cap.read()
        image_resize_val = (1280, 720)
        frame = cv2.resize(frame, image_resize_val)
        ## Average Calculation Time of Conversion Of Pixel Normalization = 0.018950 Seconds
        frame = frame / 255
        ## Average Calculation Time of Conversion Of Image Data Type = 0.001999 Seconds
        converted_img = tf.image.convert_image_dtype(frame, tf.float32)[tf.newaxis, ...]
        ## Average Calculation Time of Loading Results From Detector = 1.7 Seconds
        time_start = time.time()
        results = detector(converted_img)
        time_end = time.time()
        print("Detection Took: ", time_end - time_start)
        cv2.imshow('camera feed', frame)

    while True:
        LoadStream()
        if cv2.waitKey(1) & 0xFF == ord('q'):
            cv2.destroyAllWindows()
            break
The output from the Conda environment for this code is as follows, and nothing really seems to stick out:
(tf2-gpu) C:\Users\Isaiah\tf2\Tutorial Sets\Expert\Camera_Feed>python Camera_Feed_Raw.py
2020-05-03 16:52:36.567941: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
Dimensions: Width: 640.0 Height: 360.0
2020-05-03 16:54:52.037826: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-05-03 16:54:52.253465: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties: pciBusID: 0000:03:00.0 name: GeForce GTX 1660 computeCapability: 7.5 coreClock: 1.815GHz coreCount: 22 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 178.86GiB/s
2020-05-03 16:54:52.260714: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-05-03 16:54:52.272442: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-05-03 16:54:52.282134: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-05-03 16:54:52.287729: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-05-03 16:54:52.300130: I
tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll 2020-05-03 16:54:52.307647: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll 2020-05-03 16:54:52.326362: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll 2020-05-03 16:54:52.331006: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0 2020-05-03 16:54:52.334046: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX 2020-05-03 16:54:52.626783: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties: pciBusID: 0000:03:00.0 name: GeForce GTX 1660 computeCapability: 7.5 coreClock: 1.815GHz coreCount: 22 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 178.86GiB/s 2020-05-03 16:54:52.633826: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll 2020-05-03 16:54:52.638740: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll 2020-05-03 16:54:52.642777: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll 2020-05-03 16:54:52.647763: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll 2020-05-03 16:54:52.651710: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll 2020-05-03 16:54:52.656789: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll 2020-05-03 16:54:52.660852: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll 2020-05-03 
16:54:52.667018: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0 2020-05-03 16:54:53.626966: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix: 2020-05-03 16:54:53.630823: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] 0 2020-05-03 16:54:53.633295: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0: N 2020-05-03 16:54:53.638096: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4630 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1660, pci bus id: 0000:03:00.0, compute capability: 7.5) 2020-05-03 16:57:25.429470: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll 2020-05-03 16:57:26.697611: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll 2020-05-03 16:57:29.627538: W tensorflow/stream_executor/gpu/redzone_allocator.cc:312] Internal: Invoking GPU asm compilation is supported on Cuda non-Windows platforms only Relying on driver to perform ptx compilation. This message will be only logged once. Detection Took: 58.80091857910156 Detection Took: 1.747373104095459 Detection Took: 1.7253808975219727 Detection Took: 1.736377477645874 Detection Took: 1.7273805141448975 Detection Took: 1.7343783378601074 Detection Took: 1.742375373840332 Detection Took: 1.7413759231567383 Detection Took: 1.7293803691864014 Detection Took: 1.7283804416656494 Detection Took: 1.7403762340545654 Detection Took: 1.7323787212371826 Detection Took: 1.7373778820037842 Detection Took: 1.7323782444000244
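To put numbers on the timings printed above, the first call can be separated from the rest, since it is far slower than every later one and likely includes one-time setup (the log shows cuBLAS and cuDNN being opened just before it). This is only a sketch that quantifies the log; `detection_times` is a hypothetical helper and `LOG` is the "Detection Took" output copied from the question.

```python
import re
from statistics import mean

# "Detection Took" values copied from the output in the question.
LOG = (
    "Detection Took: 58.80091857910156 Detection Took: 1.747373104095459 "
    "Detection Took: 1.7253808975219727 Detection Took: 1.736377477645874 "
    "Detection Took: 1.7273805141448975 Detection Took: 1.7343783378601074 "
    "Detection Took: 1.742375373840332 Detection Took: 1.7413759231567383 "
    "Detection Took: 1.7293803691864014 Detection Took: 1.7283804416656494 "
    "Detection Took: 1.7403762340545654 Detection Took: 1.7323787212371826 "
    "Detection Took: 1.7373778820037842 Detection Took: 1.7323782444000244"
)


def detection_times(text):
    """Extract every 'Detection Took' duration (seconds) from log text."""
    return [float(x) for x in re.findall(r"Detection Took:\s*([0-9.]+)", text)]


times = detection_times(LOG)
warmup, steady = times[0], times[1:]   # treat the first call separately
avg = mean(steady)

print(f"first call: {warmup:.1f} s")
print(f"steady state: {avg:.2f} s per frame, about {1 / avg:.2f} frames per second")
```

At well under one frame per second, a loop that calls the detector on every frame before reading the next one will fall behind the camera regardless of how fast the capture itself is.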
While importing tensorflow I got an "error" saying the following:
2019-11-07 00:41:30.414603: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
Is this an error or is it normal? I saw 'successfully' so I'm thinking it's good, but is it?
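The single letter after the timestamp is the severity of the message: I for INFO, W for WARNING, E for ERROR, F for FATAL. A small sketch that pulls it out of a line in the format shown in this thread (`parse_tf_log` and the regular expression are illustrative, not part of TensorFlow):

```python
import re

SEVERITY = {"I": "INFO", "W": "WARNING", "E": "ERROR", "F": "FATAL"}

# timestamp, severity letter, source file:line, then the message
LINE_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+: ([IWEF]) (\S+)\] (.*)$"
)


def parse_tf_log(line):
    """Return (severity, source, message), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    letter, source, message = m.groups()
    return SEVERITY[letter], source, message


line = ("2019-11-07 00:41:30.414603: I tensorflow/stream_executor/platform/default/"
        "dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll")
print(parse_tf_log(line))
```

For the line in the question this reports INFO, so it is a status message rather than an error.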