Google Colab Pro timeout - google-colaboratory

I have trained and validated my model in Colab Pro, but when it comes to testing and printing the confusion report, my Colab instance disconnects.
I have tried the solutions provided in the answers to this question without success. How can I prevent my Colab Pro session from disconnecting?

Related

How can I embed Google credentials for Drive/BQ to scheduled colab notebooks?

I am a Colab Pro user and am beginning to test out the scheduled notebook features. Often, my scheduled notebooks include fetching from or saving items to Google Drive, or using pandas.io to access and query data from BigQuery. How can I embed my credentials to ensure the colab notebook runs as expected when scheduled and the outputs go to the appropriate destinations?
Thanks!
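One approach that is sometimes used for unattended runs (a minimal sketch, not an official Colab scheduling feature; the key path, project ID, query and output file below are hypothetical) is to authenticate with a service-account key instead of the interactive google.colab.auth flow:

    # Sketch: authenticate a scheduled notebook with a service-account key
    # instead of the interactive google.colab.auth flow (which needs a browser).
    from google.oauth2 import service_account
    import pandas as pd

    # Hypothetical path to a service-account JSON key that the scheduled
    # notebook can read (e.g. uploaded to the notebook VM or fetched from GCS).
    KEY_PATH = "/content/sa-key.json"

    credentials = service_account.Credentials.from_service_account_file(
        KEY_PATH,
        scopes=[
            "https://www.googleapis.com/auth/bigquery",
            "https://www.googleapis.com/auth/drive",
        ],
    )

    # Query BigQuery through pandas-gbq with those credentials.
    df = pd.read_gbq(
        "SELECT * FROM `my_dataset.my_table` LIMIT 100",  # hypothetical query
        project_id="my-project",                          # hypothetical project ID
        credentials=credentials,
    )

    # The same credentials object can be passed to the Drive API client
    # (googleapiclient.discovery.build("drive", "v3", credentials=credentials))
    # to push results to Drive without any interactive prompt.
    df.to_csv("results.csv", index=False)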

Using Google Colab -- GPU Device not found error

Not sure how to proceed. I've refreshed/restarted the runtime several times, switching the hardware acceleration setting to None, then back to GPU, back to None, to TPU and so on just to refresh it. I'm currently using the GPU setting and receiving this error (screenshot: "Google-colab error").
Don't use a VPN while connecting to Google Colab; otherwise it may look as though you're connected to a GPU when you actually aren't.
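A quick way to confirm whether the runtime actually sees a GPU (a small sketch assuming TensorFlow, which is what Colab's own GPU example notebook uses for this check):

    # Sketch: check whether the Colab runtime actually has a GPU attached.
    import tensorflow as tf

    device_name = tf.test.gpu_device_name()
    if device_name:
        print("GPU found at:", device_name)  # typically '/device:GPU:0'
    else:
        print("GPU device not found - check Runtime > Change runtime type.")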

Does Google Colab uses local resources too? How can I stop that?

I noticed that whenever I open a Google Colab notebook my system fans spin up and all 4 cores show heavy usage (on my Ubuntu laptop). Clearly a lot of JavaScript is running on my system.
However, when I host a Jupyter notebook on another machine and use it from my laptop, the resource usage is normal.
Q: Is there a way to make Google Colab use minimal resources of my PC?
While Google Colab is an awesome way to share my code (and ask questions), the fan noise annoys me a lot.
P.S. If this is not the right place to ask this, kindly let me know where I can ask it.
Check whether your Google Colab notebook is running on a local runtime. By default, it runs on its own Compute Engine backend, but you do have the option to change that.
P.S. It could also be Google Chrome simply using too many resources when running Colab. Try Edge or another less power-hungry browser.

Google Colab variable values lost after VM recycling

I am using a Google Colab Jupyter notebook for algorithm training and have been struggling with an annoying problem. Since Colab is running in a VM environment, all my variables become undefined if my session is idle for a few hours. I come back from lunch and the training dataframe that takes a while to load becomes undefined and I have to read_csv again to load my dataframes.
Does anyone know how to rectify this?
If the notebook is idle for some time, it might get recycled: "Virtual machines are recycled when idle for a while" (see the Colaboratory FAQ).
There is also a hard limit on how long a virtual machine can run (up to about 12 hours).
It could also be that your notebook gets disconnected from the internet / Google Colab, which may be an issue with your network. Read more about this here or here.
There is no way to "rectify" this, but if you have already processed some data you can add a step that saves it to Google Drive before the session goes idle.
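For example, a minimal sketch of that save step (the Drive folder and file names below are hypothetical):

    # Sketch: mount Google Drive and persist an expensive-to-load DataFrame
    # so it survives VM recycling.
    from google.colab import drive
    import pandas as pd

    drive.mount("/content/drive")  # one-time interactive authorization

    df = pd.read_csv("big_input.csv")  # hypothetical slow load / preprocessing

    # Save a copy to Drive before the session goes idle.
    df.to_pickle("/content/drive/MyDrive/checkpoints/df.pkl")

    # After a recycle, reload from Drive instead of redoing the slow step.
    df = pd.read_pickle("/content/drive/MyDrive/checkpoints/df.pkl")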
You can use a local runtime with Google Colab. That way, the Colab notebook uses your own machine's resources and you won't hit these limits. More on this: https://research.google.com/colaboratory/local-runtimes.html
There are various ways to save your data in the process:
you can save to the notebook VM's filesystem, e.g. df.to_csv("my_data.csv")
you can import sqlite3, Python's built-in interface to the popular SQLite database. The difference between SQLite and other SQL databases is that the DBMS runs inside your application and the data is saved to that application's file system (see the sketch after this list). Info: https://docs.python.org/2/library/sqlite3.html
you can save to your Google Drive, download to your local file system through the browser, upload to GCP, etc. More info here: https://colab.research.google.com/notebooks/io.ipynb#scrollTo=eikfzi8ZT_rW
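A minimal sketch of the sqlite3 option mentioned above (the table and file names are arbitrary):

    # Sketch: persist a DataFrame into a local SQLite file on the VM
    # (or on a mounted Drive path) and read it back later.
    import sqlite3
    import pandas as pd

    conn = sqlite3.connect("my_data.db")  # file-based database, no server needed
    df = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]})
    df.to_sql("my_table", conn, if_exists="replace", index=False)

    restored = pd.read_sql("SELECT * FROM my_table", conn)
    conn.close()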

Running TensorFlow trainer with Cloud ML Engine on TPU produces google.rpc.QuotaFailure

I have developed a TensorFlow model on Cloud ML Engine with scaleTier: BASIC.
Running its trainer experimentally on a GPU with scaleTier: BASIC_GPU works fine, but an attempt to run it on a TPU with scaleTier: BASIC_TPU produces this error message:
type.googleapis.com/google.rpc.QuotaFailure
The request for 1 TPU_V2 accelerators exceeds the allowed maximum
of 30 K80, 30 P100.
Where does this limitation come from, and can it be lifted, e.g. by enabling another API or increasing my initial budget?
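For reference, a minimal sketch of how such a job can be submitted through the Google API Python client instead of gcloud (the job ID, bucket, module, region and runtime version are hypothetical placeholders):

    # Sketch: submit the same trainer to Cloud ML Engine with a TPU scale tier.
    from googleapiclient import discovery

    ml = discovery.build("ml", "v1")

    job_spec = {
        "jobId": "my_trainer_tpu_run",                      # hypothetical
        "trainingInput": {
            "scaleTier": "BASIC_TPU",                       # instead of BASIC / BASIC_GPU
            "packageUris": ["gs://my-bucket/trainer-0.1.tar.gz"],
            "pythonModule": "trainer.task",
            "region": "us-central1",
            "runtimeVersion": "1.9",                        # assumed TPU-capable runtime
        },
    }

    request = ml.projects().jobs().create(
        parent="projects/my-project", body=job_spec
    )
    # Fails with the QuotaFailure above until TPU quota / authorization is in place.
    response = request.execute()
    print(response)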
As announced at Google Cloud Next '18, Cloud TPUs are now available to everyone, without whitelisting.
To enable them for Cloud ML Engine, go here:
https://cloud.google.com/ml-engine/docs/tensorflow/using-tpus
...scroll down to the heading "Authorize your Cloud TPU to access your project" and follow the instructions there. In short, you need to grant the Cloud TPU service account IAM access to your project's resources.
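A sketch of the lookup step with the same Python client (the project ID is hypothetical; the role grant itself is done in the IAM console or with gcloud, as the linked page describes):

    # Sketch: find the Cloud TPU service account that needs access to the project.
    from googleapiclient import discovery

    ml = discovery.build("ml", "v1")
    config = ml.projects().getConfig(name="projects/my-project").execute()

    # The response includes the TPU service account to authorize
    # (grant it roles/ml.serviceAgent on the project, per the docs linked above).
    print(config["config"]["tpuServiceAccount"])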
I tried the same thing and got the same result. The documentation implies that TPUs are available to everyone, but that's not the case. To the best of my knowledge, you have to specially request TPU access (I filled out the request but didn't get a response).