I want to specify the GPU my process runs on, and I set it as follows:
import tensorflow as tf

with tf.device('/gpu:0'):
    a = tf.constant(3.0)
with tf.Session() as sess:
    while True:
        print(sess.run(a))
However, it still allocates memory on both of my GPUs:
| 0 7479 C python 5437MiB
| 1 7479 C python 5437MiB
There are 3 ways to achieve this:
Using the CUDA_VISIBLE_DEVICES environment variable.
Setting the environment variable CUDA_VISIBLE_DEVICES="1" makes only device 1 visible, and setting CUDA_VISIBLE_DEVICES="0,1" makes devices 0 and 1 visible. You can do this in Python by adding the line os.environ["CUDA_VISIBLE_DEVICES"] = "0,1" after importing the os package.
Using with tf.device('/gpu:2'): when creating the graph. Then it will use GPU device 2 to run.
Using config = tf.ConfigProto(device_count={'GPU': 1}) and then sess = tf.Session(config=config). Note that this caps how many GPUs TensorFlow uses (here, at most one), not which one; combine it with CUDA_VISIBLE_DEVICES to pick a specific device.
By default, TF allocates all available memory on each visible GPU if not told otherwise. Here are five ways to stick to just one (or a few) GPUs.
Bash solution. Set CUDA_VISIBLE_DEVICES=0,1 in your terminal/console before starting python or jupyter notebook:
CUDA_VISIBLE_DEVICES=0,1 python script.py
Python solution. Run the next two lines of code before constructing a session:
import os
os.environ["CUDA_VISIBLE_DEVICES"]="0,1"
Automated solution. The method below will automatically detect GPU devices that are not used by other scripts and set CUDA_VISIBLE_DEVICES for you. You have to call mask_unused_gpus before constructing a session. It filters out GPUs by current memory usage, so you can run multiple instances of your script at once without changing your code or setting console parameters.
The function:
import subprocess as sp
import os

def mask_unused_gpus(leave_unmasked=1):
    ACCEPTABLE_AVAILABLE_MEMORY = 1024  # MiB
    COMMAND = "nvidia-smi --query-gpu=memory.free --format=csv"

    try:
        _output_to_list = lambda x: x.decode('ascii').split('\n')[:-1]
        memory_free_info = _output_to_list(sp.check_output(COMMAND.split()))[1:]
        memory_free_values = [int(x.split()[0]) for x in memory_free_info]
        available_gpus = [i for i, x in enumerate(memory_free_values)
                          if x > ACCEPTABLE_AVAILABLE_MEMORY]

        if len(available_gpus) < leave_unmasked:
            raise ValueError('Found only %d usable GPUs in the system' % len(available_gpus))
        os.environ["CUDA_VISIBLE_DEVICES"] = ','.join(map(str, available_gpus[:leave_unmasked]))
    except Exception as e:
        print('"nvidia-smi" is probably not installed. GPUs are not masked', e)

mask_unused_gpus(2)
Limitations: if you start multiple scripts at once this might cause a collision, because memory is not allocated immediately when you construct a session. If that is a problem for you, you can use a randomized version, as in the original source code: mask_busy_gpus().
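A minimal sketch of that randomized variant, assuming the same nvidia-smi output format as the function above (the name mask_busy_gpus matches the original source; the shuffling is the only substantive change):
import os
import random
import subprocess as sp

def mask_busy_gpus(leave_unmasked=1, random_seed=None):
    ACCEPTABLE_AVAILABLE_MEMORY = 1024  # MiB
    command = "nvidia-smi --query-gpu=memory.free --format=csv"
    lines = sp.check_output(command.split()).decode('ascii').split('\n')[:-1][1:]
    free_memory = [int(line.split()[0]) for line in lines]
    available = [i for i, mem in enumerate(free_memory)
                 if mem > ACCEPTABLE_AVAILABLE_MEMORY]
    if len(available) < leave_unmasked:
        raise ValueError('Found only %d usable GPUs' % len(available))
    random.seed(random_seed)
    random.shuffle(available)  # random order, so concurrent scripts rarely pick the same GPU
    os.environ["CUDA_VISIBLE_DEVICES"] = ','.join(map(str, available[:leave_unmasked]))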
Tensorflow 2.0 suggests yet another method:
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    # Restrict TensorFlow to only use the first GPU
    try:
        tf.config.experimental.set_visible_devices(gpus[0], 'GPU')
    except RuntimeError as e:
        # Visible devices must be set at program startup
        print(e)
Tensorflow/Keras also allows specifying the GPU to be used via the session config. I can recommend it only if setting the environment variable is not an option (e.g., an MPI run), because it tends to be the least reliable of all the methods, especially with Keras.
config = tf.ConfigProto()
config.gpu_options.visible_device_list = "0,1"
with tf.Session(config=config) as sess:
    ...  # or: K.set_session(tf.Session(config=config))
I believe that you need to set CUDA_VISIBLE_DEVICES=1 (or whichever GPU you want to use). If you make only one GPU visible, you will refer to it as /gpu:0 in tensorflow regardless of what you set the environment variable to.
More info on that environment variable: https://devblogs.nvidia.com/cuda-pro-tip-control-gpu-visibility-cuda_visible_devices/
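To make the remapping concrete, here is a minimal sketch (TF 1.x style, matching the question): after masking, the one remaining physical GPU is always addressed as /gpu:0.
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "1"  # must be set before TensorFlow initializes CUDA

import tensorflow as tf
with tf.device('/gpu:0'):  # this is physical GPU 1, the only visible device
    a = tf.constant(3.0)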
You can modify the GPU options settings by adding the following at the beginning of your python script:
gpu_options = tf.GPUOptions(visible_device_list="0")
sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))
"0" is here the name of the GPU you want to use. You can have the list of the GPU available by typing the command nvidia-smi in the terminal prompt.
With Keras, these two functions allow the selection of CPU or GPU and, in the case of a GPU, the fraction of memory that will be used.
import os
from keras.backend.tensorflow_backend import set_session
import tensorflow as tf

def set_cpu_option():
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"  # see issue #152
    os.environ["CUDA_VISIBLE_DEVICES"] = ""

def set_gpu_option(which_gpu, fraction_memory):
    config = tf.ConfigProto()
    config.gpu_options.per_process_gpu_memory_fraction = fraction_memory
    config.gpu_options.visible_device_list = which_gpu
    set_session(tf.Session(config=config))

set_gpu_option("0", 0.9)
# or
set_cpu_option()
def set_specific_gpu(ID):
    gpus_all_physical_list = tf.config.list_physical_devices(device_type='GPU')
    tf.config.set_visible_devices(gpus_all_physical_list[ID], 'GPU')
refer to https://www.tensorflow.org/api_docs/python/tf/config/list_physical_devices
import tensorflow as tf

gpu_number = 2  # GPU number
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    tf.config.experimental.set_visible_devices(gpus[gpu_number], 'GPU')
    logical_gpus = tf.config.experimental.list_logical_devices('GPU')
    print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
The most elegant and clean way I have seen this work on my multi-GPU setup is:
import os
os.environ["CUDA_VISIBLE_DEVICES"]="1"
tf_device='/gpu:0'
This assigns the task to GPU device 1.
Similarly, doing something along the lines of:
import os
os.environ["CUDA_VISIBLE_DEVICES"]="2"
tf_device='/gpu:0'
The os.environ command can be seen as a way of exposing only the GPU device on which you intend to run the code. The second command just picks the first of the available devices that you specified.
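For completeness, a minimal sketch of how the tf_device string from the snippets above would actually be used (the constant is just a placeholder computation):
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "2"  # expose only physical GPU 2

import tensorflow as tf
tf_device = '/gpu:0'  # the first (and only) visible device

with tf.device(tf_device):
    a = tf.constant(3.0)  # placed on physical GPU 2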
import os
os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID" # see issue #152
os.environ["CUDA_VISIBLE_DEVICES"]="3"
This was the only thing that worked cleanly for me from within processes, to assign a specific GPU to each process in a Pool.
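Since that answer gives no code for the Pool part, here is a hedged sketch of the pattern (train_on_gpu and the job list are hypothetical names, and it assumes TensorFlow is not imported in the parent process, so each worker can still set its mask before the library initializes CUDA):
import os
from multiprocessing import Pool

def train_on_gpu(args):
    gpu_id, job = args
    # Set the mask before TensorFlow is imported in this worker process.
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    import tensorflow as tf  # imported here so the mask takes effect
    with tf.device('/gpu:0'):  # the masked GPU appears as /gpu:0
        pass  # build and train the model for this job

if __name__ == '__main__':
    jobs = list(enumerate(['job_a', 'job_b', 'job_c']))  # (gpu_id, job) pairs
    # maxtasksperchild=1 gives every task a fresh process and a fresh mask
    with Pool(processes=3, maxtasksperchild=1) as pool:
        pool.map(train_on_gpu, jobs)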
TensorFlow 2.9 and above changed the APIs, so here is an updated version:
gpus = tf.config.list_physical_devices('GPU')
gpu_id = 0
if gpus:
    # Restrict TensorFlow to use only the one GPU selected by gpu_id
    try:
        tf.config.set_visible_devices(gpus[gpu_id], 'GPU')
        logical_gpus = tf.config.list_logical_devices('GPU')
        print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
    except RuntimeError as e:
        # Visible devices must be set before GPUs have been initialized
        print(e)
Related question:
I work in an environment in which computational resources are shared, i.e., we have a few server machines equipped with a few Nvidia Titan X GPUs each.
For small to moderate size models, the 12 GB of the Titan X is usually enough for 2–3 people to run training concurrently on the same GPU. If the models are small enough that a single model does not take full advantage of all the computational units of the GPU, this can actually result in a speedup compared with running one training process after the other. Even in cases where the concurrent access to the GPU does slow down the individual training time, it is still nice to have the flexibility of having multiple users simultaneously train on the GPU.
The problem with TensorFlow is that, by default, it allocates the full amount of available GPU memory when it is launched. Even for a small two-layer neural network, I see that all 12 GB of the GPU memory is used up.
Is there a way to make TensorFlow only allocate, say, 4 GB of GPU memory, if one knows that this is enough for a given model?
You can set the fraction of GPU memory to be allocated when you construct a tf.Session by passing a tf.GPUOptions as part of the optional config argument:
# Assume that you have 12GB of GPU memory and want to allocate ~4GB:
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.333)
sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))
The per_process_gpu_memory_fraction acts as a hard upper bound on the amount of GPU memory that will be used by the process on each GPU on the same machine. Currently, this fraction is applied uniformly to all of the GPUs on the same machine; there is no way to set this on a per-GPU basis.
Alternatively, you can let the session grow its memory use on demand instead of reserving a fixed fraction:
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)
https://github.com/tensorflow/tensorflow/issues/1578
For TensorFlow 2.0 and 2.1 (docs):
import tensorflow as tf
tf.config.gpu.set_per_process_memory_growth(True)
For TensorFlow 2.2+ (docs):
import tensorflow as tf
gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
tf.config.experimental.set_memory_growth(gpu, True)
The docs also list some more methods:
Set environment variable TF_FORCE_GPU_ALLOW_GROWTH to true.
Use tf.config.experimental.set_virtual_device_configuration to set a hard limit on a Virtual GPU device (see the sketch below).
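A sketch of that second method, following the docs (the 2048 MiB cap is an arbitrary example value):
import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    # Cap the first GPU at 2 GB by wrapping it in a virtual device
    tf.config.experimental.set_virtual_device_configuration(
        gpus[0],
        [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=2048)])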
Here is an excerpt from the book Deep Learning with TensorFlow:
In some cases it is desirable for the process to only allocate a subset of the available memory, or to only grow the memory usage as it is needed by the process. TensorFlow provides two configuration options on the session to control this. The first is the allow_growth option, which attempts to allocate only as much GPU memory as is needed at runtime: it starts out allocating very little memory, and as sessions get run and more GPU memory is required, the GPU memory region used by the TensorFlow process is extended.
1) Allow growth: (more flexible)
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.Session(config=config, ...)
The second method is the per_process_gpu_memory_fraction option, which determines the fraction of the overall amount of memory that each visible GPU should be allocated. Note: there is no need to release the memory afterward, and doing so can even worsen memory fragmentation.
2) Allocate fixed memory:
To allocate only 40% of the total memory of each GPU:
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.4
session = tf.Session(config=config, ...)
Note:
This is only useful, though, if you truly want to bind the amount of GPU memory available to the TensorFlow process.
For Tensorflow version 2.0 and 2.1 use the following snippet:
import tensorflow as tf
gpu_devices = tf.config.experimental.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(gpu_devices[0], True)
For prior versions, the following snippet used to work for me:
import tensorflow as tf
tf_config=tf.ConfigProto()
tf_config.gpu_options.allow_growth=True
sess = tf.Session(config=tf_config)
All the answers above assume execution with a sess.run() call, which is becoming the exception rather than the rule in recent versions of TensorFlow.
When using the tf.Estimator framework (TensorFlow 1.4 and above) the way to pass the fraction along to the implicitly created MonitoredTrainingSession is,
opts = tf.GPUOptions(per_process_gpu_memory_fraction=0.333)
conf = tf.ConfigProto(gpu_options=opts)
trainingConfig = tf.estimator.RunConfig(session_config=conf, ...)
tf.estimator.Estimator(model_fn=...,
                       config=trainingConfig)
Similarly in Eager mode (TensorFlow 1.5 and above),
opts = tf.GPUOptions(per_process_gpu_memory_fraction=0.333)
conf = tf.ConfigProto(gpu_options=opts)
tfe.enable_eager_execution(config=conf)
Edit: 11-04-2018
As an example, if you are to use tf.contrib.gan.train, then you can use something similar to below:
tf.contrib.gan.gan_train(........, config=conf)
You can use
TF_FORCE_GPU_ALLOW_GROWTH=true
in your environment variables.
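If you prefer to set it from Python rather than the shell, a minimal sketch (the variable has to be set before TensorFlow initializes the GPUs, so do it before the import):
import os
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"  # read by the allocator at startup

import tensorflow as tf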
In the TensorFlow source:
bool GPUBFCAllocator::GetAllowGrowthValue(const GPUOptions& gpu_options) {
  const char* force_allow_growth_string =
      std::getenv("TF_FORCE_GPU_ALLOW_GROWTH");
  if (force_allow_growth_string == nullptr) {
    return gpu_options.allow_growth();
  }
Tensorflow 2.0 Beta and (probably) beyond
The API changed again. It can now be found in:
tf.config.experimental.set_memory_growth(
    device,
    enable
)
Aliases:
tf.compat.v1.config.experimental.set_memory_growth
tf.compat.v2.config.experimental.set_memory_growth
References:
https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/config/experimental/set_memory_growth
https://www.tensorflow.org/guide/gpu#limiting_gpu_memory_growth
See also:
Tensorflow - Use a GPU: https://www.tensorflow.org/guide/gpu
For Tensorflow 2.0 Alpha, see: this answer
All the answers above refer either to setting the memory to a certain extent in TensorFlow 1.X versions or to allowing memory growth in TensorFlow 2.X.
The method tf.config.experimental.set_memory_growth indeed works for allowing dynamic growth during allocation/preprocessing. Nevertheless, one may want to allocate a specific upper limit of GPU memory from the start.
The logic behind allocating a specific amount of GPU memory would also be to prevent OOM errors during training sessions. For example, if one trains while opening video-memory-consuming Chrome tabs or any other video-consuming process, tf.config.experimental.set_memory_growth(gpu, True) could result in OOM errors being thrown, hence the necessity of allocating more memory from the start in certain cases.
The recommended and correct way to allot memory per GPU in TensorFlow 2.X is the following:
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    # Restrict TensorFlow to only allocate 1GB of memory on the first GPU
    try:
        tf.config.experimental.set_virtual_device_configuration(
            gpus[0],
            [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024)])
    except RuntimeError as e:
        # Virtual devices must be set before GPUs have been initialized
        print(e)
Shameless plug: if you install the GPU-supported TensorFlow, the session will first allocate all GPUs whether you set it to use only the CPU or the GPU. I may add the tip that even if you set the graph to use the CPU only, you should set the same configuration (as answered above) to prevent unwanted GPU occupation.
And in an interactive interface like IPython or Jupyter, you should also set that configuration; otherwise it will allocate all the memory and leave almost none for others. This is sometimes hard to notice.
If you're using Tensorflow 2, try the following:
config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.compat.v1.Session(config=config)
For Tensorflow 2.0, this solution worked for me. (TF-GPU 2.0, Windows 10, GeForce RTX 2070)
physical_devices = tf.config.experimental.list_physical_devices('GPU')
assert len(physical_devices) > 0, "Not enough GPU hardware devices available"
tf.config.experimental.set_memory_growth(physical_devices[0], True)
# allocate 60% of GPU memory
from keras.backend.tensorflow_backend import set_session
import tensorflow as tf
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.6
set_session(tf.Session(config=config))
This code worked for me:
import tensorflow as tf
config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.compat.v1.InteractiveSession(config=config)
Well, I am new to TensorFlow. I have a GeForce 740M (or similar) GPU with 2GB of RAM, and I was running an MNIST-style handwriting example for a native language, with training data consisting of 38,700 images and 4,300 testing images. I was trying to get precision, recall and F1 using the following code, since sklearn was not giving me precise results. Once I added this to my existing code, I started getting GPU errors.
TP = tf.count_nonzero(predicted * actual)
TN = tf.count_nonzero((predicted - 1) * (actual - 1))
FP = tf.count_nonzero(predicted * (actual - 1))
FN = tf.count_nonzero((predicted - 1) * actual)
prec = TP / (TP + FP)
recall = TP / (TP + FN)
f1 = 2 * prec * recall / (prec + recall)
Plus my model was heavy, I guess; I was getting a memory error after 147 or 148 epochs. Then I thought: why not create functions for the tasks? I don't know if it works this way in TensorFlow, but I thought that if a local variable is used, then when it goes out of scope it may release memory. So I defined the above elements for training and testing in modules, and I was able to achieve 10,000 epochs without any issues. I hope this helps.
I tried to train U-Net on the VOC dataset, but because of the huge image size, memory runs out. I tried all the above tips, even batch size == 1, yet with no improvement. Sometimes the TensorFlow version also causes memory issues. Try using:
pip install tensorflow-gpu==1.8.0
I am currently trying to run a text-based sequence-to-sequence model using TensorFlow 2.6 and cuDNN.
The code runs, but it takes suspiciously long. When I check my Task Manager, I see that all the GPU memory is taken, yet the GPU is not under heavy load. That looks weird to me.
Is this expected behaviour?
System:
Windows 10
Python 3.9.9
Tensorflow & Keras 2.6
CUDA 11.6
CuDNN 8.3
NVidia RTX 3080ti
In the code, I found the following settings for the GPU:
def get_gpu_config():
    gconfig = tf.compat.v1.ConfigProto()
    gconfig.gpu_options.per_process_gpu_memory_fraction = 0.975  # don't take 100% of the memory
    gconfig.allow_soft_placement = True  # let ops without a GPU kernel fall back to CPU
    gconfig.gpu_options.allow_growth = True  # take more memory only when necessary
    return gconfig
My Python output tells me it found my graphics card, and it is also visible in my nvidia-smi output.
Am I maybe missing a configuration?
The times it takes are similar to what I got on a CPU system, which seems off to me.
Sidenote:
The code I am trying to run had to be migrated from tensorflow-gpu 1.12, but that went relatively smoothly.
Yes, this behaviour is normal for TensorFlow!
From the TensorFlow docs:
By default, TensorFlow maps nearly all of the GPU memory of all GPUs (subject to CUDA_VISIBLE_DEVICES) visible to the process. This is done to more efficiently use the relatively precious GPU memory resources on the devices by reducing memory fragmentation. To limit TensorFlow to a specific set of GPUs, use the tf.config.set_visible_devices method.
If you don't want TensorFlow to allocate the totality of your VRAM, you can either set a hard limit on how much memory to use or tell TensorFlow to only allocate as much memory as needed.
To set a hard limit
Configure a virtual GPU device as follows:
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    # Restrict TensorFlow to only allocate 1GB of memory on the first GPU
    try:
        tf.config.set_logical_device_configuration(
            gpus[0],
            [tf.config.LogicalDeviceConfiguration(memory_limit=1024)])
        logical_gpus = tf.config.list_logical_devices('GPU')
        print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
    except RuntimeError as e:
        # Virtual devices must be set before GPUs have been initialized
        print(e)
Only use as much as needed
You can set the environment variable TF_FORCE_GPU_ALLOW_GROWTH=true
OR
Use tf.config.experimental.set_memory_growth as follows:
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    try:
        # Currently, memory growth needs to be the same across GPUs
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        logical_gpus = tf.config.list_logical_devices('GPU')
        print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
    except RuntimeError as e:
        # Memory growth must be set before GPUs have been initialized
        print(e)
All code and information here is taken from https://www.tensorflow.org/guide/gpu#limiting_gpu_memory_growth
I use TensorFlow version 2.0 and would like to configure the GPUs with it.
For TensorFlow 1.x, it was done in the following way:
# GPU configuration
import tensorflow as tf
from keras.backend.tensorflow_backend import set_session

configtf = tf.compat.v1.ConfigProto()
configtf.gpu_options.allow_growth = True
configtf.gpu_options.visible_device_list = "0"
sess = tf.compat.v1.Session(config=configtf)
set_session(sess)
However, set_session is no longer available in TensorFlow 2.0, so to access the GPUs I tried following this guide. Both of the snippets below return an empty list of available GPUs, which means TensorFlow is not using them.
gpus = tf.config.experimental.list_physical_devices("GPU")
gpus
logical_gpus = tf.config.experimental.list_logical_devices('GPU')
logical_gpus
I do have access to a Tesla K80.
What is the right way to configure TF so that it uses the available GPUs? Any help will be appreciated.
What worked in this situation was to export the available GPUs in the conda environment, using the following command:
(your_environment) user#machine: export CUDA_VISIBLE_DEVICES=0
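After exporting the variable, you can verify from inside the environment that TensorFlow now sees the device (a quick check, assuming TF 2.x):
import tensorflow as tf

# Should now list the Tesla K80 instead of returning an empty list
print(tf.config.experimental.list_physical_devices('GPU'))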
I am fitting a model in a for loop, but I am getting an error that my GPU's memory is full. I am using Keras in the Anaconda Spyder IDE. My GPU is an Asus GTX 1060 6GB.
I have also used code like K.clear_session(), gc.collect(), tf.reset_default_graph() and del custom_model, but none of them worked.
The GPU properties say 98% of the memory is full.
Nothing flushes the GPU memory except numba.cuda.close(), but that will not allow me to use my GPU again. The only way to clear it is to restart the kernel and rerun my code.
I am looking for any script I can add to my code that lets me keep the for loop and clear the GPU memory on every iteration.
Wrap the model creation and training part in a function, then run it in a separate process. When training is done, the process is terminated and the GPU memory is freed.
Something like:
import multiprocessing

def create_model_and_train():
    .....
    .....

p = multiprocessing.Process(target=create_model_and_train)
p.start()
p.join()
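A slightly more complete sketch of the same pattern (create_model_and_train's body and the hyperparameter list are placeholders for your own code): each iteration runs in a fresh process, so all GPU memory is returned to the system when it exits.
import multiprocessing

def create_model_and_train(params):
    pass  # build the model, call model.fit(...), save weights to disk, etc.

if __name__ == '__main__':
    for params in [{'lr': 1e-3}, {'lr': 1e-4}]:  # hypothetical hyperparameters
        p = multiprocessing.Process(target=create_model_and_train, args=(params,))
        p.start()
        p.join()  # wait; the child's GPU memory is freed when it exits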
Or you can create the function below and call it before each run:
import gc
import tensorflow
from keras.backend.tensorflow_backend import set_session
from keras.backend.tensorflow_backend import clear_session
from keras.backend.tensorflow_backend import get_session

# Reset Keras session
def reset_keras():
    sess = get_session()
    clear_session()
    sess.close()
    sess = get_session()

    try:
        del classifier  # this is from global space - change this as you need
    except:
        pass

    print(gc.collect())  # if it does something you should see a number as output

    # use the same config as you used to create the session
    config = tensorflow.ConfigProto()
    config.gpu_options.per_process_gpu_memory_fraction = 1
    config.gpu_options.visible_device_list = "0"
    set_session(tensorflow.Session(config=config))
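With that helper defined, a sketch of the loop would call it between fits (build_model, x_train and y_train are placeholders for your own code):
for i in range(5):  # e.g. five training runs in a loop
    reset_keras()  # fresh session; the previous graph's GPU memory is released
    classifier = build_model()  # hypothetical model factory
    classifier.fit(x_train, y_train)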