How to rebuild TensorFlow with the appropriate compiler flags to enable AVX and AVX2? - tensorflow

python -c "import tensorflow as tf;print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
2023-02-03 15:58:32.821503: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
tf.Tensor(-642.5437, shape=(), dtype=float32)
What should I do?
According to the Object Detection API installation documentation, the output should look like this:
2021-06-08 18:28:38.452128: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudart64_110.dll
2021-06-08 18:28:40.948968: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library nvcuda.dll
2021-06-08 18:28:40.973992: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:02:00.0 name: GeForce GTX 1070 Ti computeCapability: 6.1
coreClock: 1.683GHz coreCount: 19 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 238.66GiB/s
2021-06-08 18:28:40.974115: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudart64_110.dll
2021-06-08 18:28:40.982483: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cublas64_11.dll
2021-06-08 18:28:40.982588: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cublasLt64_11.dll
2021-06-08 18:28:40.986795: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cufft64_10.dll
2021-06-08 18:28:40.988451: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library curand64_10.dll
2021-06-08 18:28:40.994115: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cusolver64_11.dll
2021-06-08 18:28:40.998408: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cusparse64_11.dll
2021-06-08 18:28:41.000573: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudnn64_8.dll
2021-06-08 18:28:41.001094: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-06-08 18:28:41.001651: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-06-08 18:28:41.003095: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:02:00.0 name: GeForce GTX 1070 Ti computeCapability: 6.1
coreClock: 1.683GHz coreCount: 19 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 238.66GiB/s
2021-06-08 18:28:41.003244: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-06-08 18:28:42.072538: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-06-08 18:28:42.072630: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264] 0
2021-06-08 18:28:42.072886: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0: N
2021-06-08 18:28:42.075566: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6613 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070 Ti, pci bus id: 0000:02:00.0, compute capability: 6.1)
tf.Tensor(641.5694, shape=(), dtype=float32).
I tried reinstalling the CUDA toolkit and the cuDNN libraries as instructed, but that did not help. I also uninstalled and reinstalled TensorFlow.
The Python version is 3.9.
The TensorFlow version is 2.11.0.

Related

Trying to use Tensorflow with RTX 3090 Errors

I'm attempting to use TensorFlow with an RTX 3090 GPU, but I've been running into a variety of issues for several days. I tried the remedies suggested here and elsewhere, but they didn't work. Either a kernel error occurs, or the program runs on the CPU without seeing the GPU. Could you please assist me?
2021-12-22 13:21:07.654550: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudart64_110.dll
2021-12-22 13:21:09.144192: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-12-22 13:21:09.149726: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library nvcuda.dll
2021-12-22 13:21:09.172491: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:08:00.0 name: NVIDIA GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.74GHz coreCount: 82 deviceMemorySize: 24.00GiB deviceMemoryBandwidth: 871.81GiB/s
2021-12-22 13:21:09.173145: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudart64_110.dll
2021-12-22 13:21:09.201143: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cublas64_11.dll
2021-12-22 13:21:09.201496: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cublasLt64_11.dll
2021-12-22 13:21:09.218490: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cufft64_10.dll
2021-12-22 13:21:09.222724: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library curand64_10.dll
2021-12-22 13:21:09.253841: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cusolver64_11.dll
2021-12-22 13:21:09.272022: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cusparse64_11.dll
2021-12-22 13:21:09.272867: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudnn64_8.dll
2021-12-22 13:21:09.273229: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-12-22 13:21:09.715332: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-12-22 13:21:09.715688: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264] 0
2021-12-22 13:21:09.715891: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0: N
2021-12-22 13:21:09.716223: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/device:GPU:0 with 18786 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 3090, pci bus id: 0000:08:00.0, compute capability: 8.6)
2021-12-22 13:21:10.046619: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:08:00.0 name: NVIDIA GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.74GHz coreCount: 82 deviceMemorySize: 24.00GiB deviceMemoryBandwidth: 871.81GiB/s
2021-12-22 13:21:10.047281: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-12-22 13:21:10.047754: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:08:00.0 name: NVIDIA GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.74GHz coreCount: 82 deviceMemorySize: 24.00GiB deviceMemoryBandwidth: 871.81GiB/s
2021-12-22 13:21:10.048414: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-12-22 13:21:10.048707: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-12-22 13:21:10.049027: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264] 0
2021-12-22 13:21:10.049227: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0: N
2021-12-22 13:21:10.049491: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 18786 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 3090, pci bus id: 0000:08:00.0, compute capability: 8.6)
2021-12-22 13:21:10.928282: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
2021-12-22 13:21:25.315947: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudnn64_8.dll
These are only informational messages, as the I prefix indicates. Warnings are prefixed with W and errors with E, as shown below:
2020-12-30 21:30:27.549172: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cupti64_101.dll
2020-12-30 21:30:27.599977: W tensorflow/core/framework/allocator.cc:101] Allocation of 37171200 exceeds 10% of system memory.
2021-12-30 21:30:27.704083: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1307] function cupti_interface_->Subscribe( &subscriber_, (CUpti_CallbackFunc)ApiCallback, this)failed with error CUPTI_ERROR_INSUFFICIENT_PRIVILEGES
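To make the prefix rule concrete, here is a minimal sketch that reads the severity letter from a log line in the format shown above. The function name `severity_of` and the regular expression are illustrative assumptions derived only from the sample lines in this post; they are not part of TensorFlow's API.

```python
import re

# Matches lines such as:
#   2020-12-30 21:30:27.599977: W tensorflow/core/framework/allocator.cc:101] ...
# The pattern is an assumption based on the sample lines in this post.
_LOG_LINE = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+: ([IWEF]) ")

_SEVERITY_NAMES = {"I": "INFO", "W": "WARNING", "E": "ERROR", "F": "FATAL"}


def severity_of(line):
    """Return the severity name of a TensorFlow C++ log line, or None."""
    match = _LOG_LINE.match(line)
    if match is None:
        # Continuation lines (e.g. "pciBusID: ...") carry no prefix.
        return None
    return _SEVERITY_NAMES[match.group(1)]


if __name__ == "__main__":
    sample = (
        "2020-12-30 21:30:27.599977: W tensorflow/core/framework/"
        "allocator.cc:101] Allocation of 37171200 exceeds 10% of system memory."
    )
    print(severity_of(sample))
```

Filtering a saved log through a helper like this is one way to see at a glance whether it contains anything beyond I lines.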
You can suppress these messages with the code below, which must run before TensorFlow is imported:
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'  # hide INFO and WARNING messages
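For reference, the sketch below spells out what each `TF_CPP_MIN_LOG_LEVEL` value hides. The helper `set_tf_log_level` and the `HIDDEN_BY_LEVEL` table are hypothetical names introduced here for illustration; only the environment variable itself comes from the answer above.

```python
import os

# Severities hidden by each TF_CPP_MIN_LOG_LEVEL value (C++ logging only).
HIDDEN_BY_LEVEL = {
    0: (),                            # show everything
    1: ("INFO",),
    2: ("INFO", "WARNING"),
    3: ("INFO", "WARNING", "ERROR"),
}


def set_tf_log_level(level):
    """Set TF_CPP_MIN_LOG_LEVEL and return the severities it hides.

    Call this before `import tensorflow`; the variable is read when the
    library is loaded.
    """
    if level not in HIDDEN_BY_LEVEL:
        raise ValueError(f"level must be one of {sorted(HIDDEN_BY_LEVEL)}")
    os.environ["TF_CPP_MIN_LOG_LEVEL"] = str(level)
    return HIDDEN_BY_LEVEL[level]


if __name__ == "__main__":
    print("Hidden severities:", set_tf_log_level(2))
    # import tensorflow as tf  # import only after the variable is set
```

Level 2 is a common compromise, since it still lets E lines through.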
You can also check whether TensorFlow detects the GPU by running this code:
import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))

no kernel image is available for execution on the device Fatal Python error: Aborted

I want to run yolov4 code in this repo: https://github.com/hunglc007/tensorflow-yolov4-tflite
I installed Python 3.7, all the requirements, CUDA, and cuDNN.
According to the log, cuDNN and CUDA are installed correctly, but I get the error "no kernel image is available for execution on the device". What does this error mean? Is it related to the CUDA or cuDNN version?
Python: 3.7.9, CUDA: 10.1, Tensorflow:2.3.0rc0, Tensorflow-GPU:not installed, CUDNN:7.5.0, OS: Windows10(x64)
py -3.7 save_model.py --weights ./data/yolov4.weights --output ./checkpoints/yolov4-416-tflite --input_size 416 --model yolov4 --framework tflite
2020-09-03 11:02:05.897607: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-09-03 11:02:09.504648: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library nvcuda.dll
2020-09-03 11:02:09.997508: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce 940MX computeCapability: 5.0
coreClock: 1.2415GHz coreCount: 3 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 13.41GiB/s
2020-09-03 11:02:10.017273: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-09-03 11:02:10.036505: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2020-09-03 11:02:10.059534: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2020-09-03 11:02:10.074749: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2020-09-03 11:02:10.094710: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2020-09-03 11:02:10.115167: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2020-09-03 11:02:10.140633: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-09-03 11:02:10.148636: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-09-03 11:02:10.155846: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-09-03 11:02:10.188413: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x295adc030a0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-03 11:02:10.199421: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-09-03 11:02:10.207675: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce 940MX computeCapability: 5.0
coreClock: 1.2415GHz coreCount: 3 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 13.41GiB/s
2020-09-03 11:02:10.222939: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-09-03 11:02:10.231890: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2020-09-03 11:02:10.241896: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2020-09-03 11:02:10.250393: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2020-09-03 11:02:10.260177: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2020-09-03 11:02:10.268644: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2020-09-03 11:02:10.278132: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-09-03 11:02:10.286635: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-09-03 11:02:10.380510: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-03 11:02:10.388703: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263] 0
2020-09-03 11:02:10.394562: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0: N
2020-09-03 11:02:10.402323: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1464 MB memory) -> physical GPU (device: 0, name: GeForce 940MX, pci bus id: 0000:01:00.0, compute capability: 5.0)
2020-09-03 11:02:10.429701: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x295ae120140 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-09-03 11:02:10.441631: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce 940MX, Compute Capability 5.0
2020-09-03 11:02:10.619742: F .\tensorflow/core/kernels/random_op_gpu.h:232] Non-OK-status: GpuLaunchKernel(FillPhiloxRandomKernelLaunch<Distribution>, num_blocks, block_size, 0, d.stream(), gen, data, size, dist) status: Internal: no kernel image is available for execution on the device
Fatal Python error: Aborted
The error indicates that the pre-built TensorFlow binary does not contain kernels for the SM version (compute capability) of your actual hardware.
You can refer to the link below for the supported combinations:
https://www.tensorflow.org/install/source_windows#gpu
Based on this, both 2.1.0 and 2.3.0 require cuDNN 7.4 and CUDA 10.1. You should try one of these supported combinations.
[2.3.0 release/rc2/rc0 specific] from https://github.com/tensorflow/tensorflow/releases/tag/v2.3.0 - TF 2.3 includes PTX kernels only for compute capability 7.0 to reduce the TF pip binary size. Earlier releases included PTX for a variety of older compute capabilities.
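The compatibility rule behind this error can be sketched as follows. Precompiled machine code (SASS) runs only on GPUs with the same major compute capability and an equal or higher minor version, while embedded PTX can be JIT-compiled by the driver for any GPU of equal or higher capability. The function name and the SASS list in the example are assumptions for illustration; the release note quoted above confirms only the PTX 7.0 part.

```python
def has_kernel_image(device_cc, sass_ccs, ptx_ccs):
    """Return True if a binary can run kernels on a GPU of capability device_cc.

    device_cc: (major, minor) of the GPU, e.g. (5, 0) for the GeForce 940MX.
    sass_ccs:  capabilities with precompiled machine code in the binary.
    ptx_ccs:   capabilities with embedded PTX the driver can JIT-compile.
    """
    # SASS: same major version, and compiled minor <= device minor.
    for major, minor in sass_ccs:
        if major == device_cc[0] and minor <= device_cc[1]:
            return True
    # PTX: forward compatible with any equal or higher capability.
    return any(ptx <= device_cc for ptx in ptx_ccs)


if __name__ == "__main__":
    # Hypothetical build configuration, NOT taken from the release notes:
    assumed_sass = [(3, 5), (3, 7), (5, 2), (6, 0), (6, 1), (7, 0)]
    ptx_only_70 = [(7, 0)]  # per the TF 2.3 release note quoted above

    for name, cc in [("GeForce 940MX", (5, 0)), ("GeForce GTX 1050 Ti", (6, 1))]:
        print(name, cc, has_kernel_image(cc, assumed_sass, ptx_only_70))
```

Under these assumptions a compute capability 5.0 device finds neither matching SASS nor usable PTX, which would be consistent with the "no kernel image is available" failure in the question.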

Tensorflow sees GPU but only uses xla_cpu and crashes when told to use xla_gpu

I was training my models when I noticed that they were running very slowly. After some digging I found that device GPU 0 is of type xla_cpu and the work is not going through my GPU. The xla_gpu device is listed, but when I force TensorFlow to use it, it crashes saying it can't find ptxas.
For environment information, please scroll past the error block.
This is the crash report when using this line of code:
with tf.device('/device:XLA_GPU:0'): model.fit(dataSetGenerator, epochs=1, steps_per_epoch=self.steps_per_epoch)
I get the following error:
2020-05-25 15:11:23.103677: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1050 Ti computeCapability: 6.1
coreClock: 1.62GHz coreCount: 6 deviceMemorySize: 4.00GiB deviceMemoryBandwidth: 104.43GiB/s
2020-05-25 15:11:23.104045: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-05-25 15:11:23.104234: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-05-25 15:11:23.104414: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-05-25 15:11:23.104597: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-05-25 15:11:23.104761: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-05-25 15:11:23.104977: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-05-25 15:11:23.105248: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-05-25 15:11:23.105810: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-05-25 15:11:23.106793: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1050 Ti computeCapability: 6.1
coreClock: 1.62GHz coreCount: 6 deviceMemorySize: 4.00GiB deviceMemoryBandwidth: 104.43GiB/s
2020-05-25 15:11:23.107632: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-05-25 15:11:23.107840: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-05-25 15:11:23.107957: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-05-25 15:11:23.108142: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-05-25 15:11:23.108257: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-05-25 15:11:23.108371: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-05-25 15:11:23.108487: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-05-25 15:11:23.108877: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-05-25 15:11:23.109054: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-05-25 15:11:23.109592: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108] 0
2020-05-25 15:11:23.109666: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0: N
2020-05-25 15:11:23.109998: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2990 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
2020-05-25 15:12:20.827825: W tensorflow/compiler/jit/xla_device.cc:398] XLA_GPU and XLA_CPU devices are deprecated and will be removed in subsequent releases. Instead, use either @tf.function(experimental_compile=True) for must-compile semantics, or run with TF_XLA_FLAGS=--tf_xla_auto_jit=2 for auto-clustering best-effort compilation.
2020-05-25 15:12:24.752427: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:70] Can't find ptxas binary in ${CUDA_DIR}/bin. Will back to the GPU driver for PTX -> sass compilation. This is OK so long as you don't see a warning below about an out-of-date driver version. Custom ptxas location can be specified using $PATH.
2020-05-25 15:12:24.752721: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:71] Searched for CUDA in the following directories:
2020-05-25 15:12:24.752826: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:74] ./cuda_sdk_lib
2020-05-25 15:12:24.752907: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:74] C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1
2020-05-25 15:12:24.753020: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:74] .
2020-05-25 15:12:24.753091: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:76] You can choose the search directory by setting xla_gpu_cuda_data_dir in HloModule's DebugOptions. For most apps, setting the environment variable XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda will work.
2020-05-25 15:12:24.753332: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:197] Couldn't read CUDA driver version.
2020-05-25 15:12:24.753533: I tensorflow/compiler/jit/xla_compilation_cache.cc:241] Compiled cluster using XLA! This line is logged at most once for the lifetime of the process.
2020-05-25 15:12:24.785710: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:70] Can't find ptxas binary in ${CUDA_DIR}/bin. Will back to the GPU driver for PTX -> sass compilation. This is OK so long as you don't see a warning below about an out-of-date driver version. Custom ptxas location can be specified using $PATH.
2020-05-25 15:12:24.786148: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:71] Searched for CUDA in the following directories:
2020-05-25 15:12:24.786295: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:74] ./cuda_sdk_lib
2020-05-25 15:12:24.786424: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:74] C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.1
2020-05-25 15:12:24.786541: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:74] .
2020-05-25 15:12:24.786617: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:76] You can choose the search directory by setting xla_gpu_cuda_data_dir in HloModule's DebugOptions. For most apps, setting the environment variable XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda will work.
Traceback (most recent call last):
File "D:/PiChess/Core/Main.py", line 68, in <module>
ForEx10d5w5m.TrainActiveModels()
File "D:\PiChess\Core\DatasetTemplates\ForEx_10d_5w_5m.py", line 705, in TrainActiveModels
with tf.device('/device:XLA_GPU:0'): model.fit(dataSetGenerator, epochs=1, steps_per_epoch=self.steps_per_epoch)
File "D:\PiChess\Core\venv\lib\site-packages\tensorflow\python\keras\engine\training.py", line 66, in _method_wrapper
return method(self, *args, **kwargs)
File "D:\PiChess\Core\venv\lib\site-packages\tensorflow\python\keras\engine\training.py", line 848, in fit
tmp_logs = train_function(iterator)
File "D:\PiChess\Core\venv\lib\site-packages\tensorflow\python\eager\def_function.py", line 580, in __call__
result = self._call(*args, **kwds)
File "D:\PiChess\Core\venv\lib\site-packages\tensorflow\python\eager\def_function.py", line 644, in _call
return self._stateless_fn(*args, **kwds)
File "D:\PiChess\Core\venv\lib\site-packages\tensorflow\python\eager\function.py", line 2420, in __call__
return graph_function._filtered_call(args, kwargs) # pylint: disable=protected-access
File "D:\PiChess\Core\venv\lib\site-packages\tensorflow\python\eager\function.py", line 1665, in _filtered_call
self.captured_inputs)
File "D:\PiChess\Core\venv\lib\site-packages\tensorflow\python\eager\function.py", line 1746, in _call_flat
ctx, args, cancellation_manager=cancellation_manager))
File "D:\PiChess\Core\venv\lib\site-packages\tensorflow\python\eager\function.py", line 598, in call
ctx=ctx)
File "D:\PiChess\Core\venv\lib\site-packages\tensorflow\python\eager\execute.py", line 60, in quick_execute
inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Function invoked by the following node is not compilable: {{node __inference_train_function_1659}} = __inference_train_function_1659[_XlaMustCompile=true, config_proto="\n\007\n\003CPU\020\001\n\007\n\003GPU\020\0012\005*\0010J\0008\001", executor_type=""](dummy_input, dummy_input, dummy_input, dummy_input, dummy_input, dummy_input, dummy_input, dummy_input, dummy_input, dummy_input, dummy_input, dummy_input, dummy_input, dummy_input, dummy_input, dummy_input, dummy_input, dummy_input, dummy_input, dummy_input, dummy_input, dummy_input, dummy_input, dummy_input, dummy_input, dummy_input, dummy_input, dummy_input, dummy_input, dummy_input).
Uncompilable nodes:
IteratorGetNext: unsupported op: No registered 'IteratorGetNext' OpKernel for XLA_GPU_JIT devices compatible with node {{node IteratorGetNext}}
Stacktrace:
Node: __inference_train_function_1659, function:
Node: IteratorGetNext, function: __inference_train_function_1659
[Op:__inference_train_function_1659]
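The warnings in the log above name two environment variables, `XLA_FLAGS=--xla_gpu_cuda_data_dir=...` and `TF_XLA_FLAGS=--tf_xla_auto_jit=2`. A minimal sketch of setting them from Python follows. The helper name `xla_environment` is hypothetical, `/path/to/cuda` is the placeholder used by the log message itself, and nothing here is confirmed to resolve the InvalidArgumentError in the traceback; it only acts on what the warnings suggest.

```python
import os


def xla_environment(cuda_dir, auto_jit=True):
    """Build the environment variables named in the XLA warnings."""
    env = {"XLA_FLAGS": f"--xla_gpu_cuda_data_dir={cuda_dir}"}
    if auto_jit:
        # Suggested by the deprecation warning as the replacement for
        # placing ops on the XLA_GPU device explicitly.
        env["TF_XLA_FLAGS"] = "--tf_xla_auto_jit=2"
    return env


if __name__ == "__main__":
    # Placeholder path taken from the log message; substitute the real
    # CUDA directory. A path containing spaces may need special handling,
    # which the log does not describe.
    os.environ.update(xla_environment("/path/to/cuda"))
    # import tensorflow as tf  # import only after the variables are set
    print(os.environ["XLA_FLAGS"])
```

With auto-clustering enabled this way, the `with tf.device('/device:XLA_GPU:0')` wrapper would presumably be dropped, since the warning describes that device as deprecated.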
About CUDA
PS C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\bin> nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Fri_Feb__8_19:08:26_Pacific_Standard_Time_2019
Cuda compilation tools, release 10.1, V10.1.105
For cuDNN I have the following information:
PS C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\bin> type "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\include\cudnn.h" | findstr "CUDNN_MAJOR CUDNN_MINOR CUDNN_PATCHLEVEL"
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 6
#define CUDNN_PATCHLEVEL 5
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
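As a worked example of the `CUDNN_VERSION` formula in the last line, the sketch below parses the three defines and evaluates it. The function name `cudnn_version` is an illustrative assumption; the input is the `findstr` output shown above.

```python
import re


def cudnn_version(header_text):
    """Return ((major, minor, patch), combined) from cudnn.h define lines."""
    values = {
        name: int(number)
        for name, number in re.findall(
            r"#define\s+(CUDNN_(?:MAJOR|MINOR|PATCHLEVEL))\s+(\d+)", header_text
        )
    }
    major = values["CUDNN_MAJOR"]
    minor = values["CUDNN_MINOR"]
    patch = values["CUDNN_PATCHLEVEL"]
    # Same formula as the CUDNN_VERSION macro in the header.
    return (major, minor, patch), major * 1000 + minor * 100 + patch


if __name__ == "__main__":
    findstr_output = """\
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 6
#define CUDNN_PATCHLEVEL 5
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
"""
    print(cudnn_version(findstr_output))
```

For the values shown, this is 7 * 1000 + 6 * 100 + 5 = 7605, that is, cuDNN 7.6.5.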
I used the following Python script to gather more information that might help debug this.
import tensorflow as tf
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
# sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
print(tf.__version__)
# os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
print(tf.test.is_gpu_available())
print(tf.test.is_gpu_available(
    cuda_only=False,
    min_cuda_compute_capability=None
))
print(tf.test.is_gpu_available(
    cuda_only=True,
    min_cuda_compute_capability=None
))
print(tf.test.is_built_with_cuda())
import sys
print(sys.version)
import ctypes
print(ctypes.WinDLL('cudnn64_7.dll'))
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))
This gave me the following output.
D:\PiChess\Core\venv\Scripts\python.exe D:/PiChess/Core/test2.py
2020-05-25 15:09:57.190323: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-05-25 15:09:59.098417: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-05-25 15:09:59.106107: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x256e9b45470 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-05-25 15:09:59.106369: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-05-25 15:09:59.107813: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-05-25 15:09:59.332317: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1050 Ti computeCapability: 6.1
coreClock: 1.62GHz coreCount: 6 deviceMemorySize: 4.00GiB deviceMemoryBandwidth: 104.43GiB/s
2020-05-25 15:09:59.332965: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-05-25 15:09:59.336063: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-05-25 15:09:59.340072: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-05-25 15:09:59.340908: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-05-25 15:09:59.343596: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-05-25 15:09:59.345793: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-05-25 15:09:59.351064: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-05-25 15:09:59.351981: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-05-25 15:09:59.937899: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-05-25 15:09:59.938075: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108] 0
2020-05-25 15:09:59.938167: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0: N
2020-05-25 15:09:59.938673: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/device:GPU:0 with 2990 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
2020-05-25 15:09:59.941582: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x2568c2c4ab0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-05-25 15:09:59.941737: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce GTX 1050 Ti, Compute Capability 6.1
WARNING:tensorflow:From D:/PiChess/Core/test2.py:11: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
2020-05-25 15:09:59.943635: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1050 Ti computeCapability: 6.1
coreClock: 1.62GHz coreCount: 6 deviceMemorySize: 4.00GiB deviceMemoryBandwidth: 104.43GiB/s
2020-05-25 15:09:59.943963: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-05-25 15:09:59.944147: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-05-25 15:09:59.944321: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-05-25 15:09:59.944476: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-05-25 15:09:59.944615: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-05-25 15:09:59.944739: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-05-25 15:09:59.944877: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-05-25 15:09:59.945388: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-05-25 15:09:59.945684: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-05-25 15:09:59.945803: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108] 0
2020-05-25 15:09:59.945877: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0: N
2020-05-25 15:09:59.946315: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/device:GPU:0 with 2990 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
2020-05-25 15:09:59.947867: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1050 Ti computeCapability: 6.1
coreClock: 1.62GHz coreCount: 6 deviceMemorySize: 4.00GiB deviceMemoryBandwidth: 104.43GiB/s
2020-05-25 15:09:59.948101: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-05-25 15:09:59.948249: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-05-25 15:09:59.948364: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-05-25 15:09:59.948479: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-05-25 15:09:59.948596: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-05-25 15:09:59.948734: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-05-25 15:09:59.948863: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-05-25 15:09:59.949266: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-05-25 15:09:59.949555: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-05-25 15:09:59.949672: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108] 0
2020-05-25 15:09:59.949742: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0: N
2020-05-25 15:09:59.950111: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/device:GPU:0 with 2990 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
2020-05-25 15:09:59.951438: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1050 Ti computeCapability: 6.1
coreClock: 1.62GHz coreCount: 6 deviceMemorySize: 4.00GiB deviceMemoryBandwidth: 104.43GiB/s
2020-05-25 15:09:59.951764: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-05-25 15:09:59.951930: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-05-25 15:09:59.952125: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-05-25 15:09:59.952412: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-05-25 15:09:59.952587: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-05-25 15:09:59.952763: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-05-25 15:09:59.952939: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-05-25 15:09:59.953318: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-05-25 15:09:59.953440: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-05-25 15:09:59.953558: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108] 0
2020-05-25 15:09:59.953632: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0: N
2020-05-25 15:09:59.953935: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/device:GPU:0 with 2990 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
2020-05-25 15:09:59.955052: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1050 Ti computeCapability: 6.1
coreClock: 1.62GHz coreCount: 6 deviceMemorySize: 4.00GiB deviceMemoryBandwidth: 104.43GiB/s
2020-05-25 15:09:59.955586: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 3593757145487346910
, name: "/device:XLA_CPU:0"
device_type: "XLA_CPU"
memory_limit: 17179869184
locality {
}
incarnation: 15151200481806228159
physical_device_desc: "device: XLA_CPU device"
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 3136264601
locality {
bus_id: 1
links {
}
}
incarnation: 17283739609840781326
physical_device_desc: "device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1"
, name: "/device:XLA_GPU:0"
device_type: "XLA_GPU"
memory_limit: 17179869184
locality {
}
incarnation: 2207722455070197847
physical_device_desc: "device: XLA_GPU device"
]
2.2.0
True
True
True
True
3.6.8 (tags/v3.6.8:3c6b436a57, Dec 24 2018, 00:16:47) [MSC v.1916 64 bit (AMD64)]
<WinDLL 'cudnn64_7.dll', handle 7ffe4b520000 at 0x256d1f56e80>
Num GPUs Available: 1
2020-05-25 15:09:59.955763: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-05-25 15:09:59.955935: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-05-25 15:09:59.956155: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-05-25 15:09:59.956357: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-05-25 15:09:59.956535: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-05-25 15:09:59.956719: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-05-25 15:09:59.957138: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
Process finished with exit code 0
If tf.config.list_physical_devices('GPU') returns an entry with device_type='GPU', there is no issue with your TensorFlow GPU installation.
Replace XLA_GPU:0 with GPU:0, as below, to avoid this error.
with tf.device('/device:GPU:0'):
    history = model.fit(dataSetGenerator,
                        epochs=1,
                        steps_per_epoch=self.steps_per_epoch)
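A minimal sketch of the check described above, assuming TensorFlow 2.1 or later, where tf.config.list_physical_devices is available. The helper name pick_device is hypothetical and not part of TensorFlow; it takes the imported tensorflow module as an argument and falls back to the CPU device name when no GPU is listed.

```python
def pick_device(tf_module):
    """Return a device string suitable for tf.device().

    `tf_module` is the imported tensorflow module. The function name and
    the CPU fallback are illustrative assumptions, not TensorFlow API.
    """
    # list_physical_devices('GPU') returns one entry per visible physical GPU.
    gpus = tf_module.config.list_physical_devices('GPU')
    return '/device:GPU:0' if gpus else '/device:CPU:0'
```

After `import tensorflow as tf`, the training call could then be wrapped in `with tf.device(pick_device(tf)):` instead of a hard-coded device name.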

Long GPU training execution

I've installed all the software needed to run Keras on the GPU (CUDA/cuDNN). It seems to work, as you can see in the log below. The model is a CNN, and when I train it, one epoch takes 10 minutes (the same duration as on the CPU).
Using TensorFlow backend.
2020-04-08 23:19:03.654388: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-04-08 23:19:08.302815: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-04-08 23:19:08.357836: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-04-08 23:19:09.085263: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1050 computeCapability: 6.1
coreClock: 1.493GHz coreCount: 5 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 104.43GiB/s
2020-04-08 23:19:09.093385: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-04-08 23:19:09.188055: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-04-08 23:19:09.234655: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-04-08 23:19:09.260896: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-04-08 23:19:09.345531: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-04-08 23:19:09.396772: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-04-08 23:19:09.540946: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-04-08 23:19:09.548839: F tensorflow/stream_executor/lib/statusor.cc:34] Attempting to fetch value instead of handling error Internal: failed to get device attribute 13 for device 0: CUDA_ERROR_UNKNOWN: unknown error
(.envWindows) C:\Users\Florian\Desktop\Ptrans - Smart Parking\u_park>python script_training.py
Using TensorFlow backend.
2020-04-08 23:19:29.353347: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-04-08 23:19:31.572314: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-04-08 23:19:31.628773: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-04-08 23:19:32.276405: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1050 computeCapability: 6.1
coreClock: 1.493GHz coreCount: 5 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 104.43GiB/s
2020-04-08 23:19:32.283139: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-04-08 23:19:32.290215: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-04-08 23:19:32.297466: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-04-08 23:19:32.302702: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-04-08 23:19:32.311060: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-04-08 23:19:32.317673: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-04-08 23:19:32.332136: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-04-08 23:19:32.344553: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-04-08 23:19:33.192037: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-04-08 23:19:33.196639: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] 0
2020-04-08 23:19:33.199622: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0: N
2020-04-08 23:19:33.203658: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/device:GPU:0 with 1337 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050, pci bus id: 0000:01:00.0, compute capability: 6.1)
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 4888491968132995736
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 1401988300
locality {
bus_id: 1
links {
}
}
incarnation: 16691478652768016864
physical_device_desc: "device: 0, name: GeForce GTX 1050, pci bus id: 0000:01:00.0, compute capability: 6.1"
]
2020-04-08 23:19:33.293485: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1050 computeCapability: 6.1
coreClock: 1.493GHz coreCount: 5 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 104.43GiB/s
2020-04-08 23:19:33.300364: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-04-08 23:19:33.303574: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-04-08 23:19:33.306443: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-04-08 23:19:33.309228: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-04-08 23:19:33.312697: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-04-08 23:19:33.316427: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-04-08 23:19:33.321711: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-04-08 23:19:33.325715: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-04-08 23:19:33.334856: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1050 computeCapability: 6.1
coreClock: 1.493GHz coreCount: 5 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 104.43GiB/s
2020-04-08 23:19:33.349585: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-04-08 23:19:33.356020: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-04-08 23:19:33.359692: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-04-08 23:19:33.371302: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-04-08 23:19:33.374765: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-04-08 23:19:33.386556: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-04-08 23:19:33.390107: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-04-08 23:19:33.401618: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-04-08 23:19:33.404151: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-04-08 23:19:33.412276: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] 0
2020-04-08 23:19:33.418029: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0: N
2020-04-08 23:19:33.421211: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1337 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050, pci bus id: 0000:01:00.0, compute capability: 6.1)
Found 94493 images belonging to 2 classes.
Found 18647 images belonging to 2 classes.
Found 1888 images belonging to 2 classes.
--- Start fit cycle : NewMNetV2_finetuning_unfreezed ---
Cycle 1
Epoch 1/1
2020-04-08 23:19:46.672284: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-04-08 23:19:47.032341: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-04-08 23:19:48.284266: W tensorflow/stream_executor/gpu/redzone_allocator.cc:312] Internal: Invoking GPU asm compilation is supported on Cuda non-Windows platforms only
Relying on driver to perform ptx compilation. This message will be only logged once.
2215/2952 [=====================>........] - ETA: 2:35 - loss: 0.1830 - accuracy: 0.9647
Traceback (most recent call last):
Hardware :
my PC --> LENOVO Legion Y520-15IKBN 80WK
GPU --> Nvidia GeForce GTX 1050
CPU --> i5-7300HQ

keras error when trying to get intermediate layer output: Could not create cudnn handle

I am building a model using Keras.
I am using:
Anaconda (Python 3.7)
tensorflow-gpu (2.1)
Keras (2.3.1)
CUDA (10.1.2)
cuDNN (7.6.5)
NVIDIA driver (445.7)
NVIDIA GPU: GTX 1660 Ti (6 GB)
When I try to run the model, the following code raises an error:
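from keras.models import Model

def get_gen_output(gan, noise):
    # Build a sub-model that ends at layer 24 of the GAN, then return its output for the first sample.
    intermediate_model = Model(inputs=gan.input, outputs=gan.layers[24].output)
    layer_output = intermediate_model.predict(noise)
    return layer_output[0]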
This model is a CNN GAN. I can run other CNN models fine; only this model causes a problem.
The error I get is:
Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
BaseCollectiveExecutor::StartAbort Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
From other questions about the same problem, I see two common causes:
Insufficient GPU memory. I don't think this is the problem, since the error appears even with a very small model that includes the code snippet above, while bigger models without this code work well.
A CUDA/cuDNN compatibility problem. Based on this link, the versions I listed above should work.
Any idea what the problem could be and how to fix it? I have been trying to solve this for days.
If more information is needed (a summary of the model, for example), please let me know in the comments and I will add it.
UPDATE: a comment asked me to post the logs:
(base) C:\Users\Moran>ju[yter notebook
'ju[yter' is not recognized as an internal or external command,
operable program or batch file.
(base) C:\Users\Moran>jupyter notebook
[I 16:42:41.966 NotebookApp] Serving notebooks from local directory: C:\Users\Moran
[I 16:42:41.967 NotebookApp] The Jupyter Notebook is running at:
[I 16:42:41.967 NotebookApp] http://localhost:8888/?token=ec3a664897f7d31597f7f4544609cc8c0d7b4db7450b55b1
[I 16:42:41.967 NotebookApp] or http://127.0.0.1:8888/?token=ec3a664897f7d31597f7f4544609cc8c0d7b4db7450b55b1
[I 16:42:41.967 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 16:42:42.000 NotebookApp]
To access the notebook, open this file in a browser:
file:///C:/Users/Moran/AppData/Roaming/jupyter/runtime/nbserver-15820-open.html
Or copy and paste one of these URLs:
http://localhost:8888/?token=ec3a664897f7d31597f7f4544609cc8c0d7b4db7450b55b1
or http://127.0.0.1:8888/?token=ec3a664897f7d31597f7f4544609cc8c0d7b4db7450b55b1
[I 16:42:47.284 NotebookApp] Kernel started: ae448b14-33fc-471e-a2ae-991be8321434
[W 16:42:47.740 NotebookApp] 404 GET /api/kernels/4ce83e1e-9aa5-4c93-97d8-55dc16480242/channels?session_id=eaa90dc2c0bb4c448d6a01d66f4fbb21 (127.0.0.1): Kernel does not exist: 4ce83e1e-9aa5-4c93-97d8-55dc16480242
[W 16:42:47.757 NotebookApp] 404 GET /api/kernels/4ce83e1e-9aa5-4c93-97d8-55dc16480242/channels?session_id=eaa90dc2c0bb4c448d6a01d66f4fbb21 (127.0.0.1) 18.94ms referer=None
[W 16:42:49.439 NotebookApp] 404 GET /api/kernels/b9e9b610-9c5b-4565-8b85-deb70837c31f/channels?session_id=34072dd627c74e96b496ef73d99601a9 (::1): Kernel does not exist: b9e9b610-9c5b-4565-8b85-deb70837c31f
[W 16:42:49.440 NotebookApp] 404 GET /api/kernels/b9e9b610-9c5b-4565-8b85-deb70837c31f/channels?session_id=34072dd627c74e96b496ef73d99601a9 (::1) 2.00ms referer=None
2020-04-12 16:43:00.321827: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-04-12 16:43:02.652473: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-04-12 16:43:02.685848: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1660 Ti computeCapability: 7.5
coreClock: 1.59GHz coreCount: 24 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 268.26GiB/s
2020-04-12 16:43:02.693105: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-04-12 16:43:02.700970: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-04-12 16:43:02.708335: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-04-12 16:43:02.713049: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-04-12 16:43:02.720598: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-04-12 16:43:02.726428: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-04-12 16:43:02.738007: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-04-12 16:43:02.741940: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-04-12 16:43:02.745942: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2020-04-12 16:43:02.754621: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1660 Ti computeCapability: 7.5
coreClock: 1.59GHz coreCount: 24 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 268.26GiB/s
2020-04-12 16:43:02.761464: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-04-12 16:43:02.766394: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-04-12 16:43:02.770257: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-04-12 16:43:02.773975: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-04-12 16:43:02.777827: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-04-12 16:43:02.782949: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-04-12 16:43:02.786952: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-04-12 16:43:02.791207: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-04-12 16:43:03.372450: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-04-12 16:43:03.376375: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] 0
2020-04-12 16:43:03.379436: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0: N
2020-04-12 16:43:03.382400: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/device:GPU:0 with 4625 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1660 Ti, pci bus id: 0000:01:00.0, compute capability: 7.5)
2020-04-12 16:43:03.966022: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1660 Ti computeCapability: 7.5
coreClock: 1.59GHz coreCount: 24 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 268.26GiB/s
2020-04-12 16:43:03.976011: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-04-12 16:43:03.980766: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-04-12 16:43:03.985179: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-04-12 16:43:03.988922: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-04-12 16:43:03.992744: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-04-12 16:43:03.997758: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-04-12 16:43:04.001856: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-04-12 16:43:04.006936: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-04-12 16:43:04.009739: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-04-12 16:43:04.014702: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] 0
2020-04-12 16:43:04.017351: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0: N
2020-04-12 16:43:04.020371: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4625 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1660 Ti, pci bus id: 0000:01:00.0, compute capability: 7.5)
[W 16:43:04.449 NotebookApp] Replacing stale connection: 4ce83e1e-9aa5-4c93-97d8-55dc16480242:eaa90dc2c0bb4c448d6a01d66f4fbb21
2020-04-12 16:43:05.280820: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-04-12 16:43:06.518456: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
2020-04-12 16:43:06.522375: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED
2020-04-12 16:43:06.525103: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node 1/convolution}}]]
[W 16:43:06.741 NotebookApp] Replacing stale connection: b9e9b610-9c5b-4565-8b85-deb70837c31f:34072dd627c74e96b496ef73d99601a9
[I 16:43:08.454 NotebookApp] Saving file at /generative models/GAN.ipynb
Remove the NVIDIA CUDA toolkit from both the Anaconda environment and the system:
sudo apt-get remove nvidia-cuda-toolkit
conda remove cudatoolkit
Then set the following option when creating the TensorFlow session.
For TensorFlow:
import tensorflow as tf
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.Session(config=config, ...)
For Keras:
from keras.backend.tensorflow_backend import set_session
import tensorflow as tf
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)
set_session(sess) # set this TensorFlow session as the default session for Keras
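tf.ConfigProto and tf.Session are TensorFlow 1.x names; in TensorFlow 2.x they live under tf.compat.v1. Since the question lists tensorflow-gpu 2.1, the same allow-growth setting can probably be expressed through the TF 2 configuration API, roughly as sketched below. The helper name enable_memory_growth is hypothetical; tf.config.experimental.set_memory_growth is the TensorFlow call it wraps, and it has to run before any GPU has been initialized.

```python
def enable_memory_growth(tf_module):
    """Ask TensorFlow to allocate GPU memory on demand instead of all at once.

    `tf_module` is the imported tensorflow module. The function name is an
    illustrative assumption, not TensorFlow API. Returns the number of GPUs
    that were configured.
    """
    gpus = tf_module.config.list_physical_devices('GPU')
    for gpu in gpus:
        # Must be set before the GPU is first used, i.e. before any op runs on it.
        tf_module.config.experimental.set_memory_growth(gpu, True)
    return len(gpus)
```

A typical call site would be `enable_memory_growth(tf)` right after `import tensorflow as tf`, before any model is built or any tensor is created.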