An error occurred while starting the kernel - tensorflow

I have installed all the necessary software (opencv, tensorflow-gpu, matplotlib, scikit-learn, pandas, keras 2) to run my code and validated each package. I am using Spyder as my IDE and want to train a CNN in Keras with the TensorFlow backend. My code runs fine until it reaches the training stage:
hist = model.fit(X_train, y_train, batch_size=32, nb_epoch=num_epoch, verbose=1, validation_data=(X_test, y_test))
When I run this line, training appears to start, but instead of displaying the epochs and other metrics (val_acc, training_acc, etc.) the kernel suddenly dies, reconnects, dies again, and so on.
At the end I get this error:
2018 16:25:49.961500: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2018 16:25:50.664501: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:1212] Found device 0 with properties:
name: GeForce GT 740 major: 3 minor: 0 memoryClockRate(GHz): 1.0715
pciBusID: 0000:01:00.0
totalMemory: 1.00GiB freeMemory: 756.79MiB
2018 16:25:50.664501: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:1312] Adding visible gpu devices: 0
2018 16:25:51.148102: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:993] Creating TensorFlow device (/device:GPU:0 with 501 MB memory) -> physical GPU (device: 0, name: GeForce GT 740, pci bus id: 0000:01:00.0, compute capability: 3.0)
2018 16:27:22.549779: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:1312] Adding visible gpu devices: 0
2018 16:27:22.549779: I C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:993] Creating TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 224 MB memory) -> physical GPU (device: 0, name: GeForce GT 740, pci bus id: 0000:01:00.0, compute capability: 3.0)
2018 16:27:43.118021: E C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\stream_executor\cuda\cuda_dnn.cc:378] Loaded runtime CuDNN library: 7101 (compatibility version 7100) but source was compiled with 7003 (compatibility version 7000). If using a binary install, upgrade your CuDNN library to match. If building from sources, make sure the library loaded at runtime matches a compatible version specified during compile configuration.
2018 16:27:43.164821: F C:\tf_jenkins\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\kernels\conv_ops.cc:717] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo(), &algorithms)
I thought it was a Spyder problem and opened an issue on GitHub, but I was told it is not Spyder-related; it is a compatibility problem.
I searched the web hoping to find a solution, but it seems nobody has had exactly the same issue (at least among the posts I came across).
If someone has had the same problem, please help me.
What am I supposed to do?

I had the same issue when I was using a Jupyter notebook; the fix for me was changing the browser I was running the code with.
If using a different IDE (e.g. Jupyter notebook, PyCharm) does not work, I would recommend running the script from a terminal/command prompt.
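If the dying kernel still hides the full message, here is a minimal sketch (assuming TensorFlow 1.x with the Keras 2 setup from the question) that you can save to a file and run from a plain command prompt, so the complete cuDNN log stays visible when the process aborts:
import tensorflow as tf
from tensorflow.python.client import device_lib
print("TensorFlow:", tf.__version__)
print(device_lib.list_local_devices())  # should list /device:GPU:0 if CUDA/cuDNN load
# A tiny convolution forces cuDNN to initialize; with a mismatched cuDNN the
# process aborts here with the same "Loaded runtime CuDNN library ..." error.
with tf.Session() as sess:
    x = tf.random_normal([1, 32, 32, 3])
    y = tf.layers.conv2d(x, filters=8, kernel_size=3)
    sess.run(tf.global_variables_initializer())
    sess.run(y)
print("Convolution ran without crashing.")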

Related

Process finished with exit code -1073740791 (0xC0000409) tensorflow-gpu

Here is the code I am using:
import math
import pandas_datareader as web
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense, LSTM
import matplotlib.pyplot as plt
Stock = 'BTC-USD'
#Get the stock quote
df = web.DataReader(Stock, data_source='yahoo', start='2016-01-01', end='2020-12-17')
#Show the Data
#print(df)
#Get the number of rows and columns in the data set
#print(df.shape)
#Visualize the closing price history
plt.figure(figsize=(16,8))
plt.title("Close Price History")
plt.plot(df['Close'])
plt.xlabel('Date', fontsize=18)
plt.ylabel('Close Price USD ($)', fontsize=18)
plt.show()
print(1)
#Create a new dataframe with only the 'Close' column
data = df.filter(['Close'])
#Convert the dataframe to a numpy array
dataset = data.values
#Get the number of rows to train the model on
training_data_len = math.ceil(len(dataset) * .8)
print(2)
#print(training_data_len)
#Scale the data
scaler = MinMaxScaler(feature_range=(0,1))
scaled_data = scaler.fit_transform(dataset)
print(3)
#print(scaled_data)
#Create the training data set
#Create the scaled training data set
train_data = scaled_data[0:training_data_len, :]
#Split the data into x_train and y_train data sets
x_train = []
y_train = []
print(4)
for i in range(60, len(train_data)):
    x_train.append(train_data[i-60:i, 0])
    y_train.append(train_data[i, 0])
    if i <= 61:
        print(x_train)
        print(y_train)
        print()
print(5)
#Convert the x_train and y_train to numpy arrays
x_train, y_train = np.array(x_train), np.array(y_train)
print(6)
#Reshape the x_train data set
x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))
#print(x_train.shape)
print(7)
#Build the LSTM model
model = Sequential()
model.add(LSTM(50, return_sequences=True, input_shape=(x_train.shape[1], 1)))
model.add(LSTM(50, return_sequences=False))
model.add(Dense(25))
model.add(Dense(1))
print(8)
#Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')
print(9)
#Train the model
model.fit(x_train, y_train, batch_size=1, epochs=1)
print(10)
#Create the test data set
#Create a new array containing scaled values from index 1543 to 2003
test_data = scaled_data[training_data_len - 60: 2003]
#Create the data sets x_test and y_test
x_test = []
y_test = dataset[training_data_len:, :]
for i in range(60, len(test_data)):
    x_test.append(test_data[i-60:i, 0])
print(11)
#Convert the data to a numpy array
x_test = np.array(x_test)
#print(x_test.shape)
#Reshape the data
x_test = np.reshape(x_test, (x_test.shape[0], x_test.shape[1], 1))
#Get the models predicted price values
predictions = model.predict(x_test)
predictions = scaler.inverse_transform(predictions)
#Evaluate the model. Get the root mean squared error (RMSE)
rmse = np.sqrt(np.mean((predictions - y_test)**2))
print(rmse)
#Plot the data
train = data[:training_data_len]
valid = data[training_data_len: ]
valid['Predictions'] = predictions
#Visualize the data
plt.figure(figsize=(16,8))
plt.title('Model')
plt.xlabel('Date', fontsize=18)
plt.ylabel('Close Price USD ($)', fontsize=18)
plt.plot(train['Close'])
plt.plot(valid[['Close', 'Predictions']])
plt.legend(['Train', 'Val', 'Predictions'], loc='lower right')
plt.show()
#Show the valid and predicted prices
print(valid)
Going by my print-statement debugging, it stops between 9 and 10, and this is the output:
C:\Users\gunne\anaconda3\envs\EnvBioWell\python.exe C:/Users/gunne/PycharmProjects/pythonProject/Stocks_Price.py
2021-02-19 09:57:52.322041: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
1
2
3
4
[array([0.00307386, 0.00303452, 0.00288404, 0.00301928, 0.00296962,
0.00284426, 0.00411515, 0.00390359, 0.00365686, 0.00367355,
0.00369274, 0.00313341, 0.00298767, 0.00289699, 0. ,
0.00101894, 0.00078898, 0.00100278, 0.00069457, 0.00245455,
0.00201685, 0.00079746, 0.00101697, 0.0016967 , 0.00120293,
0.00122168, 0.00134546, 0.00070072, 0.00066494, 0.00061141,
0.00019479, 0.00038312, 0.00044424, 0.00024669, 0.00110931,
0.0009756 , 0.00053531, 0.00053962, 0.00040029, 0.00051366,
0.00076044, 0.00067284, 0.00087522, 0.00120881, 0.00188371,
0.00157436, 0.00189504, 0.00228295, 0.00254865, 0.00247892,
0.00319813, 0.00326988, 0.00322377, 0.00247677, 0.00266203,
0.00264398, 0.00297805, 0.00299417, 0.00303742, 0.00322153])]
[0.003108507179665692]
[array([0.00307386, 0.00303452, 0.00288404, 0.00301928, 0.00296962,
0.00284426, 0.00411515, 0.00390359, 0.00365686, 0.00367355,
0.00369274, 0.00313341, 0.00298767, 0.00289699, 0. ,
0.00101894, 0.00078898, 0.00100278, 0.00069457, 0.00245455,
0.00201685, 0.00079746, 0.00101697, 0.0016967 , 0.00120293,
0.00122168, 0.00134546, 0.00070072, 0.00066494, 0.00061141,
0.00019479, 0.00038312, 0.00044424, 0.00024669, 0.00110931,
0.0009756 , 0.00053531, 0.00053962, 0.00040029, 0.00051366,
0.00076044, 0.00067284, 0.00087522, 0.00120881, 0.00188371,
0.00157436, 0.00189504, 0.00228295, 0.00254865, 0.00247892,
0.00319813, 0.00326988, 0.00322377, 0.00247677, 0.00266203,
0.00264398, 0.00297805, 0.00299417, 0.00303742, 0.00322153]), array([0.00303452, 0.00288404, 0.00301928, 0.00296962, 0.00284426,
0.00411515, 0.00390359, 0.00365686, 0.00367355, 0.00369274,
0.00313341, 0.00298767, 0.00289699, 0. , 0.00101894,
0.00078898, 0.00100278, 0.00069457, 0.00245455, 0.00201685,
0.00079746, 0.00101697, 0.0016967 , 0.00120293, 0.00122168,
0.00134546, 0.00070072, 0.00066494, 0.00061141, 0.00019479,
0.00038312, 0.00044424, 0.00024669, 0.00110931, 0.0009756 ,
0.00053531, 0.00053962, 0.00040029, 0.00051366, 0.00076044,
0.00067284, 0.00087522, 0.00120881, 0.00188371, 0.00157436,
0.00189504, 0.00228295, 0.00254865, 0.00247892, 0.00319813,
0.00326988, 0.00322377, 0.00247677, 0.00266203, 0.00264398,
0.00297805, 0.00299417, 0.00303742, 0.00322153, 0.00310851])]
[0.003108507179665692, 0.0026196096172032522]
5
6
7
2021-02-19 09:57:55.479350: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-02-19 09:57:55.479830: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library nvcuda.dll
2021-02-19 09:57:55.506019: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
pciBusID: 0000:0e:00.0 name: GeForce RTX 3080 computeCapability: 8.6
coreClock: 1.74GHz coreCount: 68 deviceMemorySize: 10.00GiB deviceMemoryBandwidth: 707.88GiB/s
2021-02-19 09:57:55.506184: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2021-02-19 09:57:55.509645: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2021-02-19 09:57:55.509728: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2021-02-19 09:57:55.511906: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll
2021-02-19 09:57:55.512542: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll
2021-02-19 09:57:55.516144: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusolver64_10.dll
2021-02-19 09:57:55.517468: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll
2021-02-19 09:57:55.518014: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
2021-02-19 09:57:55.518133: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2021-02-19 09:57:55.518454: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-02-19 09:57:55.519487: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
pciBusID: 0000:0e:00.0 name: GeForce RTX 3080 computeCapability: 8.6
coreClock: 1.74GHz coreCount: 68 deviceMemorySize: 10.00GiB deviceMemoryBandwidth: 707.88GiB/s
2021-02-19 09:57:55.519678: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2021-02-19 09:57:55.519763: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2021-02-19 09:57:55.519836: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2021-02-19 09:57:55.519915: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll
2021-02-19 09:57:55.519995: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll
2021-02-19 09:57:55.520066: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusolver64_10.dll
2021-02-19 09:57:55.520138: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll
2021-02-19 09:57:55.520212: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
2021-02-19 09:57:55.520304: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2021-02-19 09:57:56.033929: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-02-19 09:57:56.034021: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267] 0
2021-02-19 09:57:56.034068: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1280] 0: N
2021-02-19 09:57:56.034278: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1406] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 8444 MB memory) -> physical GPU (device: 0, name: GeForce RTX 3080, pci bus id: 0000:0e:00.0, compute capability: 8.6)
2021-02-19 09:57:56.035130: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
8
9
2021-02-19 09:57:56.610845: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
2021-02-19 09:57:57.789418: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2021-02-19 09:57:58.358210: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2021-02-19 09:57:58.377765: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
Process finished with exit code -1073740791 (0xC0000409)
I'm using my RTX 3080 GPU with cuDNN 8.0, CUDA Toolkit 11.0, and tensorflow-gpu 2.4.0 on Python 3.7. Please let me know if anyone has any suggestions! I've tried everything I can think of, and when I run code to check for the GPU it is detected, so it should be working, but I just can't get it to. I had it working a month ago before I messed something up, and now the output shows all of those timestamps with the libraries it is loading. Not sure why, but I just need help.
Have you tried this link:
https://www.reddit.com/r/tensorflow/comments/jsalkw/rtx_3090_and_tensorflow_for_windows_10_step_by/
Or have you solved it already?
Here is the content of the link:
The NVIDIA 3000 Series GPUs (Ampere) require CUDA v11 and cuDNN v8 to work. The tensorflow versions on anaconda and pip on Windows (currently at max tensorflow 2.3) do not include a tensorflow built with CUDA v11. But you can use pip to install a nightly build of tensorflow (currently tensorflow 2.5), which is built with CUDA v11. Apart from a tensorflow build with CUDA v11, you will also need the actual DLLs for CUDA v11 and cuDNN v8. Normally, you would just install these with anaconda with the packages cudatoolkit and cudnn, but while cudatoolkit is available with v11, cudnn v8 is not available in anaconda, at least for Windows. The workaround is to manually get these DLLs and set them in the system environment path (so that python/tensorflow can find and load them). So let's start:
First, install anaconda if you haven't already. Open the anaconda prompt with admin rights.
Type conda create -n tf2 python=3.8 and hit enter to create a new anaconda environment with python 3.8 (the tensorflow nightly build needs python 3.8 or higher, that's why we are using python 3.8)
Type activate tf2 or conda activate tf2 and hit enter to enter that new environment.
Install the nightly tensorflow build with pip3 install tf-nightly-gpu
Install other packages that you might need. For me, it's conda install jupyter scikit-learn matplotlib pandas
Now, download CUDA v11 from NVIDIA (https://developer.nvidia.com/cuda-downloads or https://developer.nvidia.com/cuda-toolkit-archive ). Yeah, the file is pretty big with 3GB.
Additionally, we apparently also need a Microsoft Visual Studio version with C++ for the installer to run properly. Download the free Visual Studio Community Edition (https://visualstudio.microsoft.com/downloads/ ) and install the C++ components. For this, select "Desktop development with C++", select the first 6 options, and install. This step is taken from the guide I mentioned earlier, so refer to it if you have trouble with this. For me, I already had Visual Studio set up with the C++ components on my computer, so I could skip this step.
Now, let's first run the CUDA v11 installer. You can do the express installation, but if you already have GeForce Experience installed, you can also choose the Custom option and deselect everything that you already have installed with a higher version. For me, I only needed the very first checkbox with the CUDA options, so that might be enough.
What the CUDA v11 installer basically did was install all the CUDA v11 DLLs, headers, and so on in the directory "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1" (the version may be different for you). What we will do next: add the cuDNN DLLs, headers, etc. to this directory as well and then add this directory to the system path. OK, let's go.
Download cuDNN from NVIDIA (https://developer.nvidia.com/rdp/cudnn-download ). This file is around 700MB. You need to register as a developer and answer some questions, but don't worry, it's free. When asked for an email, you can type in any email, since in the next page, you will get an option to login using google or facebook as an alternative (which you may or may not prefer). Once you downloaded the file, extract it. Going into the directory, you will see three folders "bin", "include", "lib". Comparing it with the CUDA v11 directory (C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1), you'll notice that these directories are present there as well! So just copy the folders from cuDNN to the CUDA v11 directory. Windows will add the files into the existing folders.
Now, let's add those directories to the system path. In Windows, open Start and search for "This PC". Right-click it and select "Properties" to open a window called "System". On the left side at the bottom, select "Advanced system settings". Click "Environment Variables..." at the bottom. Here, in the lower half, in "System variables", find and open "Path". Here, click "New" to add a new directory to the system path. Do this for each of the following directories (as mentioned earlier, the version number may be different for you). Some of the directories may already be listed there, so feel free to skip them (there is no negative effect from double entries though, so don't worry too much):
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\bin
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\libnvvp
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\extras\CUPTI\lib64
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\include
Now, very important: Restart your system!
Now, run your code to see if everything works. For me, it was through a Jupyter notebook. A simple thing to do first is to import tensorflow and check the physical devices:
import tensorflow as tf
tf.config.list_physical_devices()
Your GPU may not show up. Take a close look at the output of the console (for me, it was the anaconda prompt with which I started up my jupyter notebook). There, you should see logs like tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll or a similar log stating that a certain DLL could not be loaded! In my case, everything loaded except the DLL "cusolver64_10.dll". So, I went to the CUDA v11 directory (C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1), opened the "bin" folder (the DLLs are in there) to check if that DLL was there. Nope, it was not. Instead, there was "cusolver64_11.dll". So what I did was just copy that DLL and renamed the copy to "cusolver64_10.dll". Yeah, sounds dumb, but after that, everything worked.
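As an extra check (this is not part of the original guide, just a sketch using the DLL names from the TensorFlow 2.4 log earlier in this thread; yours may differ), you can verify from Python that each DLL is actually present in one of the PATH directories you just added:
import os
dlls = ["cudart64_110.dll", "cublas64_11.dll", "cublasLt64_11.dll",
        "cufft64_10.dll", "curand64_10.dll", "cusolver64_10.dll",
        "cusparse64_11.dll", "cudnn64_8.dll"]
path_dirs = os.environ.get("PATH", "").split(os.pathsep)
for name in dlls:
    # Report whether any directory on PATH contains the DLL file
    found = any(os.path.isfile(os.path.join(d, name)) for d in path_dirs)
    print("OK     " if found else "MISSING", name)
Any DLL reported as MISSING points to a directory that still needs to be added to the PATH, or to the renaming trick described above.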

tensorflow | GpuLaunchKernel() no kernel image is available on the device

I downloaded a program from GitHub and tried to run it on my laptop. I've checked that my GPU's compute capability is 5.0, and I don't know how to solve this. I installed CUDA 10.1 and cuDNN 8.0; is this the problem?
2020-08-13 16:03:07.860157: F .\tensorflow/core/kernels/conv_2d_gpu.h:504] Non-OK-status: GpuLaunchKernel(ShuffleInTensor3Simple<T, 2, 1, 0>, config.block_count, config.thread_per_block, 0, d.stream(), config.virtual_thread_count, in.data(), combined_dims, out.data()) status: Internal: no kernel image is available for execution on the device
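A quick way to compare what the installed wheel expects against what the GPU provides (a sketch assuming TensorFlow 2.4 or newer; tf.config.experimental.get_device_details is not available in older releases):
import tensorflow as tf
build = tf.sysconfig.get_build_info()
# CUDA/cuDNN versions this TensorFlow binary was built against
print("Built against CUDA:", build.get("cuda_version"), "cuDNN:", build.get("cudnn_version"))
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    details = tf.config.experimental.get_device_details(gpus[0])
    print("GPU compute capability:", details.get("compute_capability"))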

Tensorflow: device CUDA:0 not supported by XLA service while setting up XLA_GPU_JIT device number 0

I got this error when using Keras with the TensorFlow backend:
tensorflow.python.framework.errors_impl.InvalidArgumentError: device CUDA:0 not supported by XLA service
while setting up XLA_GPU_JIT device number 0
Relevant code:
tfconfig = tf.ConfigProto()
tfconfig.graph_options.optimizer_options.global_jit_level = tf.OptimizerOptions.ON_1
tfconfig.gpu_options.allow_growth = True
K.tensorflow_backend.set_session(tf.Session(config=tfconfig))
tensorflow version: 1.14.0
Chairman Guo's code:
os.environ["CUDA_VISIBLE_DEVICES"] = "1"
solved my problem of jupyter notebook kernel crashing at:
tf.keras.models.load_model(path/to/my/model)
The fatal message was:
2020-01-26 11:31:58.727326: F
tensorflow/stream_executor/lib/statusor.cc:34] Attempting to fetch
value instead of handling error Internal: failed initializing
StreamExecutor for CUDA device ordinal 0: Internal: failed call to
cuDevicePrimaryCtxRetain: CUDA_ERROR_UNKNOWN: unknown error
My TF's version is: 2.2.0-dev20200123. There are 2 GPUs on this system.
This could be because your TF-default (i.e. first) GPU is running out of memory. If you have multiple GPUs, divert your Python program to run on the other GPUs. In TF (assuming TF-2.0-rc1), set the following:
# Specify which GPU(s) to use
os.environ["CUDA_VISIBLE_DEVICES"] = "1" # Or 2, 3, etc. other than 0
# On CPU/GPU placement
config = tf.compat.v1.ConfigProto(allow_soft_placement=True, log_device_placement=True)
config.gpu_options.allow_growth = True
tf.compat.v1.Session(config=config)
# Note that ConfigProto disappeared in TF-2.0
If, however, your environment has only one GPU, then perhaps you have no choice but to ask your buddy to stop his program, and then treat him to a cup of coffee.
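For completeness, here is a rough TF 2.x equivalent of the snippet above (a sketch assuming tensorflow >= 2.1, where ConfigProto is gone; run it before any GPU op executes):
import tensorflow as tf
gpus = tf.config.list_physical_devices('GPU')
if len(gpus) > 1:
    # Hide GPU 0 and use GPU 1 instead
    tf.config.set_visible_devices(gpus[1], 'GPU')
    # Analogue of allow_growth: allocate GPU memory only as needed
    tf.config.experimental.set_memory_growth(gpus[1], True)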

Google Cloud AI Platform Notebook Instance won't use GPU with Jupyter

I'm using the pre-built AI Platform Jupyter Notebook instances to train a model with a single Tesla K80 card. The issue is that I don't believe the model is actually training on the GPU.
nvidia-smi returns the following during training:
No Running Processes Found
Note the "No Running Processes Found", yet "Volatile GPU Usage" is 100%. Something seems strange...
...And the training is excruciatingly slow.
A few days ago, I was having issues with the GPU not being released after each notebook run. When this occurred I would receive an OOM (out of memory) error. This required me to go into the console every time, find the PID of the process using the GPU, and run kill -9 on it before re-running the notebook. Today, however, I can't get the GPU to run at all; it never shows a running process.
I've tried 2 different GCP AI Platform Notebook instances (both of the available tensorflow version options) with no luck. Am I missing something with these "pre-built" instances?
Pre-Built AI Platform Notebook Section
Just to clarify, I did not build my own instance and then install access to Jupyter notebooks. Instead, I used the built-in Notebook instance option under the AI Platform submenu.
Do I still need to configure a setting somewhere or install a library to continue using/reset my chosen GPU? I was under the impression that the virtual machine was already loaded with the Nvidia stack and should be plug and play with GPUs.
Thoughts?
EDIT: Here is a full video of the issue as requested --> https://www.youtube.com/watch?v=N5Zx_ZrrtKE&feature=youtu.be
Generally speaking, you'll want to try to debug issues like this using the smallest possible bit of code that could reproduce your error. That removes many possible causes for the issue you're seeing.
In this case, you can check if your GPUs are being used by running this code (copied from the TensorFlow 2.0 GPU instructions):
import tensorflow as tf
print("GPU Available: ", tf.test.is_gpu_available())
tf.debugging.set_log_device_placement(True)
# Create some tensors
a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
c = tf.matmul(a, b)
print(c)
Running it on the same TF 2.0 Notebook gives me the output:
GPU Available: True
Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:0
tf.Tensor(
[[22. 28.]
[49. 64.]], shape=(2, 2), dtype=float32)
That right there shows that it's using the GPUs.
Similarly, if you need more evidence, running nvidia-smi gives the output:
jupyter@tf2:~$ nvidia-smi
Tue Jul 30 00:59:58 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.104 Driver Version: 410.104 CUDA Version: 10.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla K80 Off | 00000000:00:04.0 Off | 0 |
| N/A 36C P0 58W / 149W | 10900MiB / 11441MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 7852 C /usr/bin/python3 10887MiB |
+-----------------------------------------------------------------------------+
So why isn't your code using GPUs? You're using a library someone else wrote, probably for tutorial purposes. Most likely those library functions are doing something that is causing CPUs to be used instead of GPUs.
You'll want to debug that code directly.
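One way to do that (a sketch assuming the TF 2.x API on these instances) is to disable soft device placement, so any op the library quietly runs on the CPU raises an error instead of hiding the fallback, while logging where each op executes:
import tensorflow as tf
tf.config.set_soft_device_placement(False)   # fail loudly instead of falling back to CPU
tf.debugging.set_log_device_placement(True)  # log the device chosen for each op
with tf.device('/GPU:0'):
    x = tf.random.normal([1024, 1024])
    y = tf.matmul(x, x)
print(y.device)  # should end in .../device:GPU:0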

Operation was explicitly assigned to /GPU:1 but available devices are

My system has 2 Nvidia GPUs: a GTX 750 Ti, which is assigned to the OS for graphics, and a GTX 1080 Ti, which is free to be used for TensorFlow. I use the call tensorflow::graph::SetDefaultDevice("/GPU:1", &graph); to enable GPU 1.
I am running TF 1.10, hand-compiled and configured for C++/CMake, with CUDA compilation tools release 9.1, V9.1.85. When I assign GPU 1 to execute my graph, I get the following error:
"Invalid argument: Cannot assign a device for operation 'h1_w/read':
Operation was explicitly assigned to /GPU:1 but available devices are
[ /job:localhost/replica:0/task:0/device:CPU:0,
/job:localhost/replica:0/task:0/device:GPU:0 ]. Make sure the device
specification refers to a valid device. [[{{node h1_w/read}} =
Identity[T=DT_FLOAT, _class=["loc:@h1_w"], _device="/GPU:1"]]]"