Now, in TensorFlow 2.0 a lot of things have changed.
I need to know whether there is any way to free the GPU memory whenever I want.
Call tf.keras.backend.clear_session()
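For example, roughly like this when training several models in a row (the loop over layer sizes is just an illustration, not from any particular codebase). Clearing the session drops the graphs and other global state Keras holds on to; whether the driver actually hands memory back to the OS depends on the allocator.

import tensorflow as tf

for units in [32, 64, 128]:  # hypothetical sweep over model sizes
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(units, activation="relu"),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
    # ... train / evaluate the model here ...

    # Drop the model and Keras's global state before building the next one.
    del model
    tf.keras.backend.clear_session()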
I am training an object detection pipeline developed using the TensorFlow library. My problem is that even after stopping the script, memory usage stays really high and does not go down. Can somebody recommend a remedy for this problem?
I am using TensorFlow 2.6 and the TensorFlow Object Detection API to train on my data.
Even after I re-ran my script (model_main_tf2) after stopping the older runs, those older processes (with the same name, model_main_tf2) are still consuming a lot of memory, as can be seen in the figure below.
Try running:
tf.keras.backend.clear_session()
after you are finished running a model.
Is there any way to prevent full GPU memory allocation for MXNet, so that it only allocates what it needs rather than the whole GPU memory?
I want to run another model in TensorFlow/Keras on the same GPU alongside MXNet, and it seems that the whole memory gets reserved by MXNet.
MXNet allocates memory as needed. Perhaps there is a memory leak in your program, or TensorFlow is pre-allocating memory on the entire GPU, which is its default behavior. That behavior is configurable with tf.GPUOptions; see the links on how to use those options.
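For example, something along these lines with the TF 1.x API (the 0.5 fraction is just an illustration; pick whatever share you want to leave free for MXNet):

import tensorflow as tf

# Either let TensorFlow grow its allocation on demand ...
gpu_options = tf.GPUOptions(allow_growth=True)
# ... or cap it at a fixed fraction of the card, e.g. 50%:
# gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.5)

config = tf.ConfigProto(gpu_options=gpu_options)
sess = tf.Session(config=config)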
Hope that helps,
Vishaal
I have the issue that my GPU memory is not released after closing a TensorFlow session in Python. These three lines suffice to cause the problem:
import tensorflow as tf
sess=tf.Session()
sess.close()
After the third line the memory is not released. I have been up and down many forums and tried all sorts of suggestions, but nothing has worked for me. For details please also see my comment at the bottom here:
https://github.com/tensorflow/tensorflow/issues/19731
Here I have documented the ways in which I manage to kill the process and thus release the memory, but this is not useful for long-running and automated processes. I would very much appreciate any further suggestions to try. I am using Windows.
EDIT: I have now found a solution that at least allows me to do what I am trying to do. I am still NOT able to release the memory, but I am able to 'reuse' it. The code has this structure:
import tensorflow as tf
from keras import backend as K
cfg = K.tf.ConfigProto()
# cfg.gpu_options.allow_growth = True  # this is optional
cfg.gpu_options.per_process_gpu_memory_fraction = 0.8  # you can use any percentage here
# upload your data and define your model (2 layers in this case) here
for i in range(len(neuron1)):
    for j in range(len(neuron2)):
        K.set_session(K.tf.Session(config=cfg))
        # train your NN for i, j
The first time the script enters the loop, the GPU memory is allocated (80% in the above example) and stays allocated, but the code nonetheless seems to reuse that same memory. I reckon the K.set_session(K.tf.Session(config=cfg)) call somehow destroys or resets the old session, allowing the memory to be 'reused' within this context at least. Note that I am not using sess.close() or K.clear_session() or explicitly resetting the default graph; those did not work for me either. When the loops are done, the GPU memory is still full.
Refer to this discussion. You can reuse your allocated memory but if you want to free the memory, then you would have to exit the Python interpreter itself.
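A sketch of what "exit the interpreter" can look like in practice without killing your main script: run each training job in a child process, so the CUDA context (and all GPU memory) is released when that process exits. This is a generic standard-library pattern rather than something specific to the linked discussion; the hyperparameter dicts are placeholders.

import multiprocessing as mp

def train_one_model(config):
    # Import TF inside the child so the CUDA context is created here and
    # destroyed when this process exits, returning all GPU memory.
    import tensorflow as tf
    sess = tf.Session()
    # ... build and train your model using `config` ...
    sess.close()

if __name__ == "__main__":  # required on Windows (spawn start method)
    for config in [{"lr": 1e-3}, {"lr": 1e-4}]:  # placeholder hyperparameters
        p = mp.Process(target=train_one_model, args=(config,))
        p.start()
        p.join()  # GPU memory is freed once the child terminates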
If I'm understanding correctly, it should be as simple as:
from numba import cuda
cuda.close()
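A slightly fuller version of the same idea, assuming numba is installed and device 0 is the GPU in question. Be aware that this tears down the CUDA context for the current process, so any TensorFlow state tied to it becomes unusable afterwards.

from numba import cuda

cuda.select_device(0)  # pick the GPU whose context should be torn down
cuda.close()           # destroy the context and release its GPU memory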
Solved this problem myself. It was because there were too many images in the CelebA dataset and my dataloader was very inefficient. Loading the data took too much time and caused the low speed.
But still, this could not explain why the code was running on the CPU while the GPU memory was also taken up. In the end I just switched to PyTorch.
My environment: Windows 10, CUDA 9.0, cuDNN 7.0.5, tensorflow-gpu 1.8.0.
I am working on a CycleGAN model. At first it worked fine with my toy dataset and could run on the GPU without major problems (though the first 10 iterations took an extremely long time, which suggests it might have been running on the CPU).
I later tried the CelebA dataset, changing only the folder name used to load the data (I load all the data into memory at once, then use my own next_batch function and feed_dict to train the model). Then the problem arose: the GPU memory was still occupied according to GPU-Z, but the GPU load was low (less than 10%) and training was very slow (more than 10 times slower than normal), which suggests the code was running on the CPU.
Would anyone please give me some advice? Any help is appreciated, thanks.
What batch size were you trying? If it's too low (something like 2-8) for a small model, the memory consumed will not be much. It all depends on your batch size, the number of parameters in your model, the model architecture, and how much of the model can run in parallel. Maybe try increasing your batch size and re-running it?
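To make the batch-size point concrete, here is a back-of-the-envelope estimate of how a single activation tensor scales with batch size (all numbers are hypothetical):

# Rough activation-memory estimate for one conv feature map.
batch, d, h, w, channels = 8, 16, 112, 112, 32
bytes_per_float = 4
feature_map_bytes = batch * d * h * w * channels * bytes_per_float
print(feature_map_bytes / 1024**2, "MiB for one activation tensor")
# -> 196.0 MiB; halving the batch size roughly halves this.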
I am currently building a 3D convolutional network for video classification. The main problem is that I run out of memory too easily; even if I set my batch_size to 1, there is still not enough memory to train my CNN the way I want.
I am using a GTX 970 with 4 GB of VRAM (3.2 GB free for TensorFlow to use). I was expecting it to still train my network, perhaps using system RAM as a backup or doing the calculations in parts. So far I have only been able to run it by making the CNN simpler, which directly hurts performance.
I could run on the CPU, but it is significantly slower, so that is not a good solution either.
Is there a better solution than to buy a better GPU?
Thanks in advance.
Using gradient checkpointing will help with memory limits.
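A minimal sketch of one way to do that in TensorFlow 2.x with tf.recompute_grad (the kernel sizes and input shape are made up; the same idea applies to your 3D CNN). The wrapped block's activations are recomputed during the backward pass instead of being kept in GPU memory, trading extra compute for a lower peak.

import tensorflow as tf

w1 = tf.Variable(tf.random.normal([3, 3, 3, 1, 8]))   # 3-D conv kernels
w2 = tf.Variable(tf.random.normal([3, 3, 3, 8, 8]))

def block(x):
    x = tf.nn.relu(tf.nn.conv3d(x, w1, strides=[1] * 5, padding="SAME"))
    return tf.nn.relu(tf.nn.conv3d(x, w2, strides=[1] * 5, padding="SAME"))

# Discard the block's intermediate activations and recompute them when
# gradients are needed.
block_ckpt = tf.recompute_grad(block)

x = tf.random.normal([1, 8, 32, 32, 1])  # (batch, depth, height, width, channels)
with tf.GradientTape() as tape:
    loss = tf.reduce_mean(block_ckpt(x))
grads = tape.gradient(loss, [w1, w2])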