Error with Tutorial in Google Colab and GPU: "Failed to get convolution algorithm" - tensorflow

When I run this tutorial, https://www.tensorflow.org/lite/tutorials/model_maker_object_detection,
on a GPU, I get the following error: "Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above."
I do not have any warning log messages printed above the error message.
There are no issues when I run it on a CPU.
I made no changes to the tutorial before running it.
Searching online, I found that resetting the runtime may make the problem disappear, but that does not work for me.
I also found that the problem may be related to the cuDNN version. Before spending time changing the cuDNN version, I would like to hear your opinions.
Please let me know if you need further information.
Thanks,
Federico
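
A workaround often suggested for this error, offered here as a sketch rather than a confirmed fix for this tutorial, is to let TensorFlow allocate GPU memory on demand instead of reserving it all at startup. One commonly reported cause of the message is that cuDNN has no memory left to initialize with. The helper name enable_gpu_memory_growth is hypothetical; tf.config.list_physical_devices and tf.config.experimental.set_memory_growth are TensorFlow 2.x calls, and the snippet assumes it runs before anything else in the notebook touches the GPU.

```python
def enable_gpu_memory_growth():
    """Ask TensorFlow to grow GPU memory on demand; return the GPU names found."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this environment; nothing to configure.
        return []

    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        # Must be called before the GPU is initialized; otherwise TensorFlow
        # raises a RuntimeError.
        tf.config.experimental.set_memory_growth(gpu, True)
    return [gpu.name for gpu in gpus]


if __name__ == "__main__":
    print("GPUs configured for memory growth:", enable_gpu_memory_growth())
```

If the returned list is empty on a GPU runtime, TensorFlow is not seeing the GPU at all, which points at the driver or CUDA setup rather than at memory.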

Related

Google colab CUDA error: no kernel image is available for execution on the device

When I ran my code on a Colab GPU today, I was assigned an A100-SXM4-40GB and got the error 'CUDA error: no kernel image is available for execution on the device'. I ran exactly the same code on a Colab GPU last week and there was no such error. I also tried running the code on a CPU, and it has no issue. Could you please let me know what is happening? Thanks.
Please see https://github.com/googlecolab/colabtools/issues/2287 for a workaround (and subscribe to be notified of the eventual resolution).
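
This error generally means that the installed binary contains no GPU code compiled for the device's compute capability; the A100 reports compute capability 8.0. As a rough check, and assuming the code uses PyTorch (the question does not say), the sketch below compares the device capability with the architectures the installed build was compiled for. The helper arch_supported is hypothetical and simplifies CUDA's compatibility rules: an sm_XY binary is treated as usable on devices with the same major version and an equal or higher minor version, and compute_XY PTX as usable on any equal or newer device.

```python
def arch_supported(capability, arch_list):
    """Return True if any entry in arch_list can run on a device with `capability`."""
    major, minor = capability
    for arch in arch_list:
        kind, _, digits = arch.partition("_")
        if len(digits) < 2 or not digits.isdigit():
            continue  # skip entries this simplified parser does not understand
        a_major, a_minor = int(digits[:-1]), int(digits[-1])
        # Compiled binaries run within the same major version, minor >= build minor.
        if kind == "sm" and a_major == major and a_minor <= minor:
            return True
        # PTX can be JIT-compiled for any device at least as new as the PTX target.
        if kind == "compute" and (a_major, a_minor) <= (major, minor):
            return True
    return False


if __name__ == "__main__":
    try:
        import torch  # assumption: the failing code is PyTorch-based
    except ImportError:
        torch = None
    if torch is not None and torch.cuda.is_available():
        cap = torch.cuda.get_device_capability()
        archs = torch.cuda.get_arch_list()
        print("device capability:", cap)
        print("build architectures:", archs)
        print("supported:", arch_supported(cap, archs))
```

If the check reports that the architecture is not supported, installing a build compiled for a newer CUDA toolkit is the usual direction to look in; the linked issue has the Colab-specific workaround.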

pyinstaller rejects numpy on Mac (M1, Big Sur & Intel Catalina)

PyInstaller had been working fine for me in this program until I began using numpy. The error message seems to say that I'm using the wrong version of numpy. I took that to mean that the version I'm using isn't built for the M1 chip. But I've also tried this on an Intel MacBook Air, to no avail: same error message there. I also looked at the material that the error message refers to and tried that too, i.e., I did a pip install of cython, etc. Still no go, and the same error message. I'm building the program in PyCharm, and have tried this using both its venv environment and a conda environment.
Both give the same result. Any ideas?
Here's the error message:
Please note and check the following:
The Python version is: Python3.9 from "/Users/abMain/zaad/zaad"
The NumPy version is: "1.21.2"
and make sure that they are the versions you expect.
Please carefully study the documentation linked above for further help.
Original error was: dlopen(/var/folders/n1/26fjs9wx6hz5q9r5kdt5bn3w0000gp/T/_MEIJmdMCM/numpy/core/_multiarray_umath.cpython-39-darwin.so, 2): no suitable image found. Did find:
/var/folders/n1/26fjs9wx6hz5q9r5kdt5bn3w0000gp/T/_MEIJmdMCM/numpy/core/_multiarray_umath.cpython-39-darwin.so: mach-o, but wrong architecture
/private/var/folders/n1/26fjs9wx6hz5q9r5kdt5bn3w0000gp/T/_MEIJmdMCM/numpy/core/_multiarray_umath.cpython-39-darwin.so: mach-o, but wrong architecture
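
The "mach-o, but wrong architecture" lines indicate that the bundled numpy extension was built for a different CPU architecture than the process trying to load it. A quick way to narrow this down is to compare the interpreter's architecture with the architecture recorded in the .so file's Mach-O header. The sketch below is a minimal reader for that header; the function name macho_archs is hypothetical, and it only recognizes 64-bit x86_64 and arm64 images and universal (fat) binaries.

```python
import platform
import struct
import sys

MH_MAGIC_64 = 0xFEEDFACF  # 64-bit Mach-O magic, as read little-endian
FAT_MAGIC = 0xCAFEBABE    # universal binary magic; fat headers are big-endian
CPU_NAMES = {0x01000007: "x86_64", 0x0100000C: "arm64"}


def macho_archs(data):
    """Return the CPU architectures named in a Mach-O or fat header."""
    if len(data) < 8:
        return []
    (magic_le,) = struct.unpack_from("<I", data, 0)
    if magic_le == MH_MAGIC_64:
        (cputype,) = struct.unpack_from("<I", data, 4)
        return [CPU_NAMES.get(cputype, hex(cputype))]
    (magic_be,) = struct.unpack_from(">I", data, 0)
    if magic_be == FAT_MAGIC:
        (count,) = struct.unpack_from(">I", data, 4)
        archs = []
        for i in range(count):
            offset = 8 + i * 20  # each fat_arch entry is five 32-bit fields
            if offset + 4 > len(data):
                break
            (cputype,) = struct.unpack_from(">I", data, offset)
            archs.append(CPU_NAMES.get(cputype, hex(cputype)))
        return archs
    return []


if __name__ == "__main__":
    print("Interpreter architecture:", platform.machine())
    for path in sys.argv[1:]:
        with open(path, "rb") as fh:
            print(path, macho_archs(fh.read(4096)))
```

Run it with the path to the _multiarray_umath .so file from the numpy installed in the build environment. If the interpreter reports arm64 and the file reports only x86_64 (or the reverse), the Python used for the build and the numpy wheel do not match.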

Colab Notebook crash/restart cycle, followed by autosave failures and "Invalid Credentials" 2 days after upgrading to Colab Pro

I was just in the middle of working on an RL model and the notebook started crashing and restarting. It was never able to stay successfully connected to a remote instance. Then it began failing to autosave for a little over 6 minutes.
It may also be worth noting that before this began, I was trying to pip uninstall/reinstall to upgrade my version of numpy.
It eventually threw the following error in a modal:
Notebook loading error
There was an error loading this notebook. Ensure that the file is accessible and try again.
Invalid Credentials
DETAILS
Invalid Credentials
Eb#https://colab.research.google.com/v2/external/external_polymer_binary_extended.js?vrz=colab-20200218-085600-RC00_295745119:57:573
Ix#https://colab.research.google.com/v2/external/external_polymer_binary_extended.js?vrz=colab-20200218-085600-RC00_295745119:867:76
SJ#https://colab.research.google.com/v2/external/external_polymer_binary_extended.js?vrz=colab-20200218-085600-RC00_295745119:1538:170
pda/<#https://colab.research.google.com/v2/external/external_polymer_binary_extended.js?vrz=colab-20200218-085600-RC00_295745119:1633:22
Ca#https://colab.research.google.com/v2/external/external_polymer_binary_extended.js?vrz=colab-20200218-085600-RC00_295745119:17:336
Aa.prototype.throw_#https://colab.research.google.com/v2/external/external_polymer_binary_extended.js?vrz=colab-20200218-085600-RC00_295745119:16:402
Ea/this.throw#https://colab.research.google.com/v2/external/external_polymer_binary_extended.js?vrz=colab-20200218-085600-RC00_295745119:18:248
c#https://colab.research.google.com/v2/external/external_polymer_binary_extended.js?vrz=colab-20200218-085600-RC00_295745119:26:299
Any and all help you may be able to provide is greatly appreciated!
Thank you

Deep Reinforcement Learning Hands on, chapter 7. Can't get tensorflow to work

I'm taking a course in machine learning and can't get TensorBoard to work. I have saved runs from running a DQN, and I run:
tensorboard -logdir runs
With the following result:
2019-12-28 18:32:04.265065: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
TensorBoard 1.7.0 at http://david-linux:6006 (Press CTRL+C to quit)
So I click the link and get:
No dashboards are active for the current data set.
Probable causes:
You haven’t written any data to your event files
TensorBoard can’t find your event files.
I also get this result after having the code running for a while:
"W1228 18:34:34.186506 Thread-2 application.py:272] path /[[_dataImageSrc]] not found, sending 404
W1228 18:34:34.205581 Thread-2 application.py:272] path /[[_imageURL]] not found, sending 404"
I'm running this on Linux using Anaconda with Python 3.6, because that is what the course book uses. I have no idea what the above errors mean; I'm quite new to coding in general and reinforcement learning in particular.
This could happen if the browser is out of date. You could also try installing the latest version of TensorBoard:
pip uninstall tensorflow-tensorboard
pip install tensorboard
Also try using different browsers.
Can you just try going to http://localhost:6006 instead? It looks like your hostname is not one that actually resolves in DNS.
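
Since the "No dashboards are active" page lists missing or unfindable event files as probable causes, it may also help to confirm that the log directory actually contains event files before looking at the browser. The sketch below assumes the runs were saved under a directory named runs, as in the command above, and relies on the convention that TensorBoard event files have "tfevents" in their file names; the helper name find_event_files is hypothetical.

```python
from pathlib import Path


def find_event_files(logdir):
    """Return sorted (path, size_in_bytes) pairs for event files under logdir."""
    root = Path(logdir)
    if not root.is_dir():
        return []
    return sorted(
        (p, p.stat().st_size) for p in root.rglob("*tfevents*") if p.is_file()
    )


if __name__ == "__main__":
    found = find_event_files("runs")
    if not found:
        print("No event files found; check the path and the working directory.")
    for path, size in found:
        print(f"{size:>10} bytes  {path}")
```

An empty result usually means TensorBoard was started from a different working directory than the one the training script wrote to.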

Tensorboard projector not working on colab

The GPU version of TensorBoard has issues in Colab, although the CPU version works fine. I could not find much in the docs. This is the error:
Also, I tried the following for installation:
As you can see, I tried both the GPU and non-GPU versions, and it does not work until I disable the GPU in the runtime. Any help would be appreciated.