I am trying to fit the model into memory.
I saw that one of the possible solutions is:
import tensorflow as tf
from keras.backend.tensorflow_backend import set_session
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
or this aporoach:
#config.gpu_options.per_process_gpu_memory_fraction = 0.4 #Allocate fixed memory
And finally
set_session(tf.Session(config=config))
I wonder where do I supposed to put this?
Related
I'm trying to load a pretrained retinanet model with keras by running:
# import keras
import keras
# import keras_retinanet
from keras_retinanet import models
from keras_retinanet.utils.image import read_image_bgr, preprocess_image, resize_image
from keras_retinanet.utils.visualization import draw_box, draw_caption
from keras_retinanet.utils.colors import label_color
# set tf backend to allow memory to grow, instead of claiming everything
import tensorflow as tf
def get_session():
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
return tf.Session(config=config)
model_path = os.path.join('sample_data/snapshots', sorted(os.listdir('sample_data/snapshots'), reverse=True)[0])
print(model_path)
# load retinanet model
model = models.load_model(model_path, backbone_name='resnet50')
model = models.convert_model(model)
I am facing the following error with both codes:
OSError: SavedModel file does not exist at: sample_data/snapshots/training_5000(640_480).h5/{saved_model.pbtxt|saved_model.pb}
the cause might be some new versions of Keras or tensorflow,
soo I am going to list the versions that I am currently using.
keras.__version__
2.4.3
tf.__version__
2.4.1
Note: I am trying to run this code in my Colab.
I need to perform a job that averages large numbers of long vectors multiple times, and I would like this to be done on my GPU.
Monitoring nvtop and htop while running, I see that GPU (which always shows top activity when I train Keras models) is not being used at all in these operations, while CPU-use surges during these operations.
I have simulated it in the code snippet below (trying to minimize non-tf-work).
what am I doing wrong?
import tensorflow as tf
from tensorflow.math import add_n, add, scalar_mul
import numpy as np
tf.debugging.set_log_device_placement(True)
sess = tf.compat.v1.Session(config=config)
tf.compat.v1.keras.backend.set_session(sess)
os.environ["CUDA_VISIBLE_DEVICES"]="1"
#Make a random numpy matrix
vecs=np.random.rand(100, 300)
with sess.as_default():
with tf.device('/GPU:0'):
for _ in range(1000):
#vecs=np.random.rand(100, 300)
tf_vecs=tf.Variable(vecs, dtype=tf.float64)
tf_invlgt=tf.Variable(1/np.shape(vecs)[0],dtype=tf.float64)
vectors=tf.unstack(tf_vecs)
sum_vecs=add_n(vectors)
mean_vec=tf.Variable(scalar_mul(tf_invlgt, sum_vecs))
Thanks
Michael
I might be wrong but could it be that the cuda_visible_devices should be "0" like
import os
os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"]="0"
see github comment here
If if still does not work, you can also add a small piece of code to check if tensorflow can see the gpu devices:
from tensorflow.python.client import device_lib
def get_available_gpus():
local_device_protos = device_lib.list_local_devices()
return [x.name for x in local_device_protos if x.device_type == 'GPU']
This is mentioned here
I would like to benchmark some TensorFlow operations (for example between them or against PyTorch). However most of the time I will write something like:
import numpy as np
import tensorflow as tf
tf_device = '/GPU:0'
a = np.random.normal(scale=100, size=shape).astype(np.int64)
b = np.array(7).astype(np.int64)
with tf.device(tf_device):
a_tf = tf.constant(a)
b_tf = tf.constant(b)
%timeit tf.math.floormod(a_tf, b_tf)
The problem with this approach is that it does the computation in eager-mode (I think in particular that it has to perform GPU to CPU placement). Eventually, I want to use those ops in a tf.keras model and therefore would like to evaluate their performance in graph mode.
What is the preferred way to do it?
My google searches have led to nothing and I don't know how to use sessions like in tf 1.x.
What you are looking for is tf.function. Check this tutorial and this docs.
As the tutorial says, in TensorFlow 2, eager execution is turned on by default. The user interface is intuitive and flexible (running one-off operations is much easier and faster), but this can come at the expense of performance and deployability. To get performant and portable models, use tf.function to make graphs out of your programs.
Check this code:
import numpy as np
import tensorflow as tf
import timeit
tf_device = '/GPU:0'
shape = [100000]
a = np.random.normal(scale=100, size=shape).astype(np.int64)
b = np.array(7).astype(np.int64)
#tf.function
def experiment(a_tf, b_tf):
tf.math.floormod(a_tf, b_tf)
with tf.device(tf_device):
a_tf = tf.constant(a)
b_tf = tf.constant(b)
# warm up
experiment(a_tf, b_tf)
print("In graph mode:", timeit.timeit(lambda: experiment(a_tf, b_tf), number=10))
print("In eager mode:", timeit.timeit(lambda: tf.math.floormod(a_tf, b_tf), number=10))
I use Tensorflow version 2.0 and will like to configure the GPU's with it.
for Tensorflow 1.x, it was done in following way
# GPU configuration
from keras.backend.tensorflow_backend import set_session
import keras
configtf = tf.compat.v1.ConfigProto()
configtf.gpu_options.allow_growth = True
configtf.gpu_options.visible_device_list = "0"
sess = tf.compat.v1.Session(config=configtf)
set_session(sess)
However, set_session is not longer available in Tensorflow 2.0, so to use access GPU's, I tried following this guide. Both the codes below lead to empty list of available GPUs, which means tensorflow is not using them.
gpus = tf.config.experimental.list_physical_devices("GPU")
gpus
logical_gpus = tf.config.experimental.list_logical_devices('GPU')
logical_gpus
I do have Tesla K80 access available.
What will be the right way to configure tf for it to use the available GPUs? Any help will be appreciated.
What worked during this situation was to export the available GPU's into the conda environment using following command.
(your_environment) user#machine: export CUDA_VISIBLE_DEVICES=0
I have read answers like:
import tensorflow as tf
from keras.backend.tensorflow_backend import set_session
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.2
set_session(tf.Session(config=config))
But it just doesn't work. There seems to be so much update in both keras and TF that almost anything written in 2017 doesn't work!
So, how to limit memory usage?
One way to restrict reserving all GPU RAM in tensorflow is to grow the amount of reservation. This method will allow you to train multiple NN using same GPU but you cannot set a threshold on the amount of memory you want to reserve.
Using the following snippet before importing keras or just use tf.keras instead.
import tensorflow as tf
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
try:
for gpu in gpus:
tf.config.experimental.set_memory_growth(gpu, True)
except RuntimeError as e:
print(e)
Seems that I have the same problem. Have you tried this?
import tensorflow as tf
from keras.backend.tensorflow_backend import set_session
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)
set_session(sess)
This method will make the application allocate only as much GPU memory based on runtime allocation.
2022 update of #Yustina Ivanova's answer:
Most people will encounter errors such as (one of the following):
AttributeError: module 'tensorflow.keras.backend' has no attribute 'tensorflow_backend'
AttributeError: module 'tensorflow.keras.backend' has no attribute 'set_session'
AttributeError: module 'tensorflow' has no attribute 'ConfigProto'
AttributeError: module 'tensorflow' has no attribute 'Session'
The reason is compatibility with tf2. Assuming a keras backend, this would be the fix:
import tensorflow as tf
config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth=True
sess = tf.compat.v1.Session(config=config)
tf.compat.v1.keras.backend.set_session(sess)