Does Google Colab use local resources too? How can I stop that? - google-colaboratory

I noticed that whenever I open a Google Colab notebook, my system fans spin up and all 4 cores show heavy usage (on my Ubuntu laptop). Clearly a lot of JS is running on my system.
However, when I host a Jupyter notebook on another machine and use it from my laptop, resource usage is normal.
Q: Is there a way to make Google Colab use minimal resources of my PC?
While Google Colab is an awesome way to share my code (and ask questions), the fan noise annoys me a lot.
P.S.: If this is not the right place to ask this, kindly let me know where I can ask it.

Check whether your Colab notebook is running on a local runtime. By default it runs on its own Compute Engine backend, but you do have the option to change that.
P.S. It could also simply be Google Chrome using too many resources when running Colab. Try Edge or another less power-hungry browser.
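If you want to double-check where the heavy Python work actually happens (and therefore whether the local load is just the browser tab rendering the UI), a minimal sketch like the one below can be run in a notebook cell. It assumes psutil is available, which it is on hosted Colab runtimes; elsewhere it may need a pip install.

# Minimal sketch: report which machine the notebook's Python actually runs on.
# On a hosted Colab runtime this prints the VM's hostname and CPU/RAM figures,
# not your laptop's, which means the local CPU load is coming from the browser.
import platform
import psutil  # preinstalled on hosted Colab runtimes

print("hostname:", platform.node())
print("cpu cores:", psutil.cpu_count())
print("total RAM (GB):", round(psutil.virtual_memory().total / 1e9, 1))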

Related

Use Google Colab Resources on local IDE

I have a big doubt... I see a lot of blog posts saying that you can use the Colab front-end to edit a local Jupyter Notebook.
However, I don't see the point... the actual advantage would be to use something like DataSpell or another local IDE on a remote notebook hosted on Colab, and use the Colab resources to do the computation, so you get:
IDE-level suggestions (Colab is pretty slow compared to a local IDE)
cloud computing performance and advantages
However, I don't see any blog talking about this... is there any way to do it?

Can I run my own instance of google colab?

I would like to understand how feasible it would be to spin up my own instance of a Colaboratory server that I could run within a closed network. Using the public version is unfortunately not yet an option in my company. I would really like to have something equivalent that I could use internally, which has all of the nice features such as collaborative editing.
Has anyone tried doing this? Is it even possible?
There's no way to spin up a full instance of the Colab service; i.e., the bits that integrate with GSuite / Docs / GCP / TPUs.
But, you can run local backends using the instructions here:
http://research.google.com/colaboratory/local-runtimes.html

Is there a way to connect Google Colab to my Google Drive for good?

I found this great question here:
https://stackoverflow.com/questions/48376580/google-colab-how-to-read-data-from-my-google-drive
which helped me connect Colab to my Drive.
Here it is as well:
from google.colab import drive
drive.mount('/content/gdrive')
My question: Is there any way to go through this Google authentication process only once? Colab disconnects from time to time if not in use, and then I need to restart the authentication process.
Thanks
Authentication is done per machine: keys are exchanged to grant access to Drive. Since you always get a new machine on reconnect, you need to re-authenticate.
However, another option is to use an API token for your Google Drive access. This can be set up via the Google API Console for the Drive platform. Essentially you would have one API token you could reuse over and over again, which can tempt you to store it inside the notebook... and that is where the bad part starts.
If you opt in to using a token to "manually" mount the Drive folder, then as soon as someone gets hold of that token (e.g. through you sharing your notebook, a man-in-the-middle attack, or forgetting to delete the key), your Drive folder is compromised. That is the reason my formal answer to this question is: no, you can't.
But since Colab provides a whole machine with a Unix environment where you can execute arbitrary bash commands, you are in control, and that leaves you with additional resources for further investigation (a sketch of the token-based approach follows these links):
https://stackoverflow.com/a/50888878/2763239
https://medium.com/#uditsaini/access-google-drive-and-mount-google-drive-to-colab-notebook-google-ccbca1691d31
https://github.com/googlecolab/colabtools/issues/121#issuecomment-423326300
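For illustration only, here is a minimal sketch of how such a reusable credential is commonly cached with PyDrive. The file names mycreds.txt and client_secrets.json are assumptions rather than anything from the question, and the cached credentials file carries exactly the risk described above: whoever obtains it gets the same Drive access you have.

# Hedged sketch using PyDrive (pip install PyDrive); assumes an OAuth client was
# created in the Google API Console and client_secrets.json sits next to the notebook.
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive

gauth = GoogleAuth()
gauth.LoadCredentialsFile("mycreds.txt")   # hypothetical credential cache
if gauth.credentials is None:
    gauth.CommandLineAuth()                # first run: interactive auth, once
elif gauth.access_token_expired:
    gauth.Refresh()                        # reuse the stored refresh token
else:
    gauth.Authorize()
gauth.SaveCredentialsFile("mycreds.txt")   # this file is the sensitive part

drive = GoogleDrive(gauth)
f = drive.CreateFile({"title": "colab_test.txt"})
f.SetContentString("hello from Colab")
f.Upload()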
A recently released feature makes this much simpler. The details are described in this answer:
https://stackoverflow.com/a/60103029/8841057
The short version is that for notebooks in Drive that aren't shared, there's now a GUI option to mount Drive files automatically for a given notebook.

Google Colab variable values lost after VM recycling

I am using a Google Colab Jupyter notebook for algorithm training and have been struggling with an annoying problem. Since Colab runs in a VM environment, all my variables become undefined if my session is idle for a few hours. I come back from lunch and the training DataFrame that takes a while to load is gone, and I have to call read_csv again to reload my DataFrames.
Does anyone know how to rectify this?
If the notebook is idle for some time, it might get recycled: "Virtual machines are recycled when idle for a while" (see the Colaboratory FAQ).
There is also a hard limit on how long a virtual machine can run (reportedly around 12 hours).
What could also happen is that your notebook gets disconnected from the internet / Google Colab. This could be an issue with your network. Read more about this here or here.
There is no way to "rectify" this, but if you have processed some data, you could add a step that saves it to Google Drive before the session goes idle.
You can use a local runtime with Google Colab. That way the notebook uses your own machine's resources and you won't hit these limits. More on this: https://research.google.com/colaboratory/local-runtimes.html
There are various ways to save your data along the way:
you can save to the notebook VM's filesystem, e.g. df.to_csv("my_data.csv"), though anything stored there is lost when the VM is recycled
you can import sqlite3, Python's built-in interface to the SQLite database; the difference from other SQL databases is that the DBMS runs inside your application and the data is saved to a file on that application's filesystem. Info: https://docs.python.org/2/library/sqlite3.html
you can save to your Google Drive (a sketch is below), download to your local filesystem through the browser, upload to GCP... more info here: https://colab.research.google.com/notebooks/io.ipynb#scrollTo=eikfzi8ZT_rW
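A minimal sketch of the Drive option, assuming the same drive.mount call as in the earlier answer; the checkpoint path and the toy DataFrame are placeholders, not something from the question.

# Anything written under /content/gdrive/My Drive/ survives the VM being recycled.
import pandas as pd
from google.colab import drive

drive.mount('/content/gdrive')

df = pd.DataFrame({"feature": [1, 2, 3], "label": [0, 1, 0]})  # stand-in for the slow-to-load data

checkpoint_path = '/content/gdrive/My Drive/train_checkpoint.csv'  # hypothetical file name
df.to_csv(checkpoint_path, index=False)

# after a recycle: re-mount Drive, then reload instead of rebuilding from scratch
df = pd.read_csv(checkpoint_path)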

Is Dropbox sync bandwidth limited?

I have installed the Dropbox Python client for Linux and noticed that the sync bandwidth is quite limited:
$ dropbox status
Syncing (252,088 files remaining, 18 days left)
Downloading 252,088 files (35.1 KB/sec, 18 days left)
Is there a way to make it faster?
Note: yes, I have a 100 Mbit/s internet connection...
First, check whether the 75% bandwidth cap is enabled, as mentioned here.
If it isn't, then it's probably your internet connection: try switching to a different network source (from wireless to wired) or use a different connection. I had the same issue before, and it was solved by changing to a different internet connection; I have 100 Mbit/s too, but that alone didn't help.
Alternatively
If you already have another synced-up Dropbox and you're just trying to get the initial sync done, you can simply copy the files over to the new install.
Also take a look at LAN sync, a Dropbox feature that lets clients fetch files from other machines on the same local network.
Honestly, this isn't an SO question because it isn't really a programming question; a forum like superuser.com might be better suited.
Edit: saw that you already have a Superuser account, my bad. :)