I'm running darknet on colab and getting this error after 20 epochs.
CUDA Error: resource already mapped
darknet: ./src/cuda.c:36: check_error: Assertion `0' failed.
Has anyone faced this issue? Is it a memory issue? I saw other reports of CUDA out-of-memory errors on GitHub, but not this one.
When I trained my deep learning model on Google Colab on Nov 4th, 2021, I had no issues. The model trained in half an hour on a GPU instance, and the default TensorFlow version on Colab at that time was 2.6. Now the same code no longer works after the default TensorFlow version was upgraded to 2.7. I'm getting an OOM error, and my data, which has the shape (16,1024,1024,1), is being transformed to (16,64,1024,1024). This does not happen on my local machine (my laptop doesn't have a GPU, so the same program takes a very long time to run).
When I tried to downgrade the TensorFlow version in my Google Colab session, it gave me a cuDNN version incompatibility error.
I want to know if anyone is facing a similar issue and how we can rectify it. I have deadlines to meet.
ResourceExhaustedError: 2 root error(s) found.
(0) RESOURCE_EXHAUSTED: OOM when allocating tensor with shape[16,64,1024,1024] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[node model/concatenate/concat
(defined at /usr/local/lib/python3.7/dist-packages/keras/backend.py:3224)
]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
My source code is taken from https://colab.research.google.com/github/keras-team/keras-io/blob/master/examples/vision/ipynb/zero_dce.ipynb
You can try running this code again with a GPU in Colab:
Edit - Notebook settings - GPU - Save
It did not show any error and ran successfully when I executed the same code on a Colab GPU with TensorFlow 2.7.0.
I finally found the issue: it's due to the size of my input images, not the default TensorFlow version upgrade. If I resize my input images to 512x512, my code works fine and I no longer get the error above.
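A rough back-of-the-envelope check shows why the image size matters here. The sketch below only multiplies out the shape from the OOM message ([16,64,1024,1024], type float, so 4 bytes per element); the 512x512 shape assumes the batch size and channel count stay the same, which the post does not state explicitly. The helper names are made up for illustration.

```python
BYTES_PER_FLOAT32 = 4  # "type float" in the OOM message is 32-bit


def tensor_bytes(shape, bytes_per_element=BYTES_PER_FLOAT32):
    """Return the memory needed for one dense tensor of the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element


def to_gib(num_bytes):
    """Convert a byte count to GiB."""
    return num_bytes / 2**30


# Shape reported in the OOM message (1024x1024 inputs).
print(to_gib(tensor_bytes((16, 64, 1024, 1024))))  # 4.0

# Assumed shape for 512x512 inputs, same batch size and channels.
print(to_gib(tensor_bytes((16, 64, 512, 512))))  # 1.0
```

A single intermediate tensor of this kind needs about 4 GiB at 1024x1024 and about 1 GiB at 512x512, and a network holds several such tensors at once, so halving each image dimension cuts the activation memory by roughly a factor of four.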
When I run this tutorial https://www.tensorflow.org/lite/tutorials/model_maker_object_detection
on a GPU, I get the following error: "Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above."
I do not have any warning log messages printed above the error message.
No issues if I run on a CPU.
I do not make any changes to the tutorial before execution.
Searching online, I found that the problem may disappear if I reset the runtime. That did not work for me.
I also found that the problem may be caused by the cuDNN version. Before spending time changing the cuDNN version, I would like to hear your opinions.
Please let me know if you need further information.
Thanks,
Federico
When I ran my code on a Colab GPU today, I was assigned an A100-SXM4-40GB and got the error 'CUDA error: no kernel image is available for execution on the device'. I ran exactly the same code on a Colab GPU last week and there was no such error. I also tried running my code on the CPU, and it has no issue. Could you please let me know what is happening? Thanks.
Please see https://github.com/googlecolab/colabtools/issues/2287 for a workaround (and subscribe to be notified of the eventual resolution).
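For background, this error generally means that none of the compiled GPU code in the installed binaries matches the architecture of the assigned device. The A100 has compute capability 8.0. The sketch below illustrates the CUDA compatibility rule: binary code built for `sm_XY` runs on devices with the same major version and an equal or higher minor version, while PTX built for `compute_XY` can be JIT-compiled for any equal or newer capability. The function name and the example architecture lists are hypothetical, and the parser assumes plain numeric names such as `sm_70`. If the code happens to use PyTorch (the post does not say), `torch.cuda.get_device_capability()` and `torch.cuda.get_arch_list()` report the real values.

```python
def can_run(device_capability, arch_list):
    """Check whether a build with the given architectures can run on a device.

    device_capability: (major, minor) tuple, e.g. (8, 0) for an A100.
    arch_list: names such as "sm_70" (binary) or "compute_70" (PTX).
    """
    major, minor = device_capability
    device = major * 10 + minor
    for arch in arch_list:
        kind, _, number = arch.partition("_")
        built = int(number)
        if kind == "sm":
            # Binary code: same major version, minor not newer than the device.
            if built // 10 == major and built % 10 <= minor:
                return True
        elif kind == "compute":
            # PTX: can be JIT-compiled for any equal or newer capability.
            if built <= device:
                return True
    return False


# Hypothetical build without any architecture usable on an A100.
print(can_run((8, 0), ["sm_37", "sm_50", "sm_60", "sm_70"]))  # False

# Hypothetical build that includes sm_80.
print(can_run((8, 0), ["sm_60", "sm_70", "sm_80"]))  # True
```

This would also be consistent with the same code having worked the week before, if that session was assigned an older GPU model.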
I'm using Theano with a GPU on Google Colab and I get this error:
ERROR (theano.gpuarray): pygpu was configured but could not be imported or is too old (version 0.7 or higher required)
I've set:
import os
os.environ["THEANO_FLAGS"]="device=cuda, floatX=float32"
I also changed the runtime type to use the 'GPU' hardware accelerator.
Could you please help me solve this issue?
Thank you.
I am using Keras (Theano backend) with a GPU and CUDA 8.0. Everything works fine when I run my code in Jupyter or in an Ubuntu terminal. However, inside Eclipse (PyDev) I receive the following error when importing Keras:
ERROR (theano.sandbox.cuda): Failed to compile cuda_ndarray.cu: libcublas.so.8.0: cannot open shared object file: No such file or directory
WARNING (theano.sandbox.cuda): CUDA is installed, but device gpu0 is not available (error: cuda unavailable)
I have double-checked the interpreter, and it is the same Python as in the terminal and Jupyter. I have also added /usr/local/cuda/lib64/ to the PYTHONPATH of the interpreter, but I still get the same error!
Does anybody know how to fix this issue with PyDev?
Thank you,
I found a solution, but not the reason.
I started Eclipse from an Ubuntu terminal and it worked fine. I don't know why it couldn't find the CUDA path when I started it by double-clicking its icon.
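One likely explanation, offered as an assumption rather than a confirmed diagnosis: a program launched from a desktop icon usually does not read shell startup files such as ~/.bashrc, so any LD_LIBRARY_PATH exported there is missing from Eclipse's environment. Note that PYTHONPATH only affects where Python looks for modules; the dynamic linker that loads libcublas.so.8.0 does not consult it. The sketch below can be run from both the terminal and PyDev to compare what each environment sees. The helper name `find_library` is made up for illustration.

```python
import os


def find_library(lib_name, search_path):
    """Return the directories on a search path that contain lib_name.

    search_path uses the platform separator (":" on Linux), like
    LD_LIBRARY_PATH.
    """
    hits = []
    for directory in search_path.split(os.pathsep):
        if directory and os.path.isfile(os.path.join(directory, lib_name)):
            hits.append(directory)
    return hits


# Compare this output between the terminal and the PyDev console.
ld_path = os.environ.get("LD_LIBRARY_PATH", "")
print("LD_LIBRARY_PATH =", ld_path or "(not set)")
print("libcublas.so.8.0 found in:", find_library("libcublas.so.8.0", ld_path))
```

If the variable is set in the terminal but not in PyDev, adding LD_LIBRARY_PATH to the environment of the PyDev run configuration, or continuing to start Eclipse from the terminal, should give both the same library search path.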