Google Colab : Local Runtime use - google-colaboratory

I am currently using Google Colab, and on the getting started page we see:
Local runtime support Colab supports connecting to a Jupyter runtime
on your local machine. For more information, see our documentation.
So after reading the documentation, doing the installations, etc., I connected my Colab notebook to the local runtime using the Connect tab.
And when I access the memory info:
!cat /proc/meminfo
The output is as follows:
MemTotal: 3924628 kB
MemFree: 245948 kB
MemAvailable: 1473096 kB
Buffers: 168560 kB
Cached: 1280300 kB
SwapCached: 20736 kB
Active: 2135932 kB
Inactive: 991300 kB
Active(anon): 1397156 kB
Inactive(anon): 560124 kB
Active(file): 738776 kB
Inactive(file): 431176 kB
Unevictable: 528 kB
Mlocked: 528 kB
This is the memory info for my PC, so the notebook is clearly accessing my PC. Then how is it any different from my local Jupyter notebook? Now I can't use the high-memory environment of 13 GB, nor do I have GPU access.
Would be great if someone can explain!

The main advantages to using Colab with a local backend stem from Drive-based notebook storage: Drive commenting, ACLs, and easy link-based sharing of the finished notebook.
When using Jupyter, sharing notebooks requires sharing files. And accessing your notebooks from a different machine requires installing Jupyter rather than just loading a website.

The only benefit is keeping your notebooks in Google Drive:
- you can share them easily
- you have automatic history/versioning
- people can comment on your notebooks
You also have headings with a collapsible outline, and probably a cleaner UI (if you prefer Colab styling).

TLDR - the short answer is that it's not any different
But, here's an analogy that might help better explain what the point of that is:
Let's pretend Google Colab were something like a video game streaming service that lets users with low-end equipment play graphically demanding games by hosting the games on its own systems. If we don't have a high-end gaming PC or a very strong laptop and we want to play a new game whose steep system requirements ours barely meets (if at all), then naturally we might use this streaming service - let's call it Stadia for fun - because it lets us play at, say, 30 FPS at 720p, whereas our own computer might barely manage 15 FPS at 480p. Those users represent people like you and me, who want the benefit of the game running on another system, which in this case is equivalent to wanting Google Colab to run our iterations on its systems. So for us, it wouldn't make sense to have Stadia run locally and use our own system resources, because there's no benefit in that, even if our saved games were stored locally.
But then there are others who have a high-end PC and graphics card, with much better components and resources available to them, and let's say they also want to play the same game. They could use the same streaming service as us and play at 720p, but since their computer is more powerful and can handle the game at, say, 60 FPS in 4K, they may want to run the game on their own system resources instead of through a streaming service like Stadia. That would normally mean getting a copy of the game and installing it locally on their system to play it that way. For the sake of the example, let's pretend the game is download-only and requires 2 terabytes to install.
Now, if we pretend Stadia could spare those players from downloading and installing the game while still using their own system's resources for better graphics, that is roughly how and why connecting Colab to a local runtime is a desirable feature for someone. Sharing Colab notebooks would be like sharing a game in our theoretical version of Stadia: users wouldn't have to download or install anything, so whenever there is an update or change, they could immediately use the new version, because the actual code (or the game install, in our metaphor) is hosted remotely.
Sometimes it's hard to understand things that weren't designed for our use case when they contradict the values on which we base our decision to use them. Hopefully this helps someone who stumbles across it understand the purpose, at least in principle.

Related

Is it possible to increase the RAM in Google Colab another way?

When I run this code in Google Colab:
n = 100000000
i = []
while True:
    i.append(n * 10**66)
it crashes every time. My data is huge. The session dies after hitting 12.72 GB of RAM, but I don't get the crash prompt with the option to increase my RAM.
I just get this: Your session crashed after using all available RAM. View runtime logs
What is the solution ? Is there another way ?
You either need to upgrade to Colab Pro or if your computer itself has more RAM than the VM for Colab, you can connect to your local runtime instead.
Colab Pro will give you about twice as much memory as you have now. If that’s enough, and you’re willing to pay $10 per month, that’s probably the easiest way.
If instead you want to use a local runtime, you can hit the down arrow next to “Connect” in the top right and choose “Connect to local runtime”.
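Whichever route you take, a quick check from inside the notebook shows how much RAM the connected runtime actually has. A minimal sketch, assuming psutil is available (it ships with Colab's hosted VMs; on a local runtime you may need to pip install it):
import psutil

# Report the physical memory of whatever machine the runtime is running on
mem = psutil.virtual_memory()
print(f"Total RAM:     {mem.total / 1e9:.1f} GB")
print(f"Available RAM: {mem.available / 1e9:.1f} GB")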
The policy was changed. However, currently, this workaround works for me:
Open and copy this notebook to your Drive. Check whether you already have 25 GB of RAM by hovering over the RAM indicator at the top right (this was the case for me). If not, follow the instructions in the Colab notebook.
Source: GitHub
To double the RAM of Google Colab, use this notebook; it gives 25 GB of RAM! Note: set the runtime type to "None" to double the RAM, then change it back to GPU or TPU.
https://colab.research.google.com/drive/155S_bb3viIoL0wAwkIyr1r8XQu4ARwA9?usp=sharing
As you said, 12 GB: this needs a lot of RAM.
If you only need a small increase, you can use Colab Pro.
If you need a large increase and you are using a deep learning framework, my advice is to use:
1- the university's computers (academic & research computing)
2- a platform like AWS, GCP, etc.
3- your own high-end computer with a GPU (I don't recommend this)

Can multiple Colab notebooks share the same Runtime?

In Q1 2019, I ran some experiments and I noticed that Colab notebooks with the same Runtime type (None/GPU/TPU) would always share the same Runtime (i.e., the same VM). For example, I could write a file to disk in one Colab notebook and read it in another Colab notebook, as long as both notebooks had the same Runtime type.
However, I tried again today (October 2019) and it now seems that each Colab notebook gets its own dedicated Runtime.
My questions are:
When did this change happen? Was this change announced anywhere?
Is this always true now? Will Runtimes sometimes be shared and sometimes not?
What is the recommended way to communicate between two Colab notebooks? I'm guessing Google Drive?
Thanks
Distinct notebooks are indeed isolated from one another. Isolation isn't configurable.
For file sharing, I think you're right that Drive is the best bet as described in the docs:
https://colab.research.google.com/notebooks/io.ipynb#scrollTo=u22w3BFiOveA
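As a minimal sketch of that Drive hand-off (the file name shared_results.txt is purely illustrative), each notebook mounts Drive into its own VM and passes data through a file:
from google.colab import drive

# Run in both notebooks: each gets its own VM, but they see the same Drive
drive.mount('/content/drive')

path = '/content/drive/My Drive/shared_results.txt'  # hypothetical file name

# Notebook A writes the data...
with open(path, 'w') as f:
    f.write('hello from notebook A')

# ...and notebook B, after its own drive.mount(), reads it back
with open(path) as f:
    print(f.read())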
I have found no easy way of running multiple notebooks within the same runtime. That said, I have no idea how this affects the quota. On my real computer, I'd limit GPU memory per script and run multiple Python threads (a rough sketch follows). Colab doesn't let you do this, and I think that if you don't use the whole amount of RAM, they shouldn't treat it the same as if you had used all of the GPU for 12 or 24 hours; they could pool your tasks with those of other users.
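For reference, the per-script GPU memory cap mentioned above looks roughly like this on your own machine, assuming TensorFlow as the framework (the answer names none, and the 2048 MB figure is just an example; this does not change how Colab meters your session):
import tensorflow as tf

# Cap this process at roughly 2 GB of GPU memory so other scripts can share the card
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    tf.config.set_logical_device_configuration(
        gpus[0],
        [tf.config.LogicalDeviceConfiguration(memory_limit=2048)])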

Data usage from any application

I want to see how much 3G data every app uses. Is this possible in iOS 5.x? And in iOS 4.x? My goal is, for example:
Maps consumed 3 MB from your data plan
Mail consumed 420 kB from your data plan
etc, etc. Is this possible?
EDIT:
I just found an app doing that: Data Man Pro
EDIT 2:
I'm starting a bounty. Extra points go to the answer that makes this clear. I know it is possible (see the screenshot from Data Man Pro) and I'm sure the solution is limited. But what is the solution, and how do I implement it?
These are just hints, not a solution. I have thought about this many times but never really started implementing the whole thing.
First of all, you can calculate transferred bytes by querying the network interfaces; take a look at this SO answer for code and a nice explanation of network interfaces on iOS.
Use sysctl or similar system functions to detect which apps are currently running (by running I mean the process state is set to RUNNING, as the ps or top commands show on OS X; I've never tried it, I just suppose it is possible on iOS, hoping there are no problems with an app running as an unprivileged user), so you can deduce which apps are running and save the traffic stats for them. Obviously, given that applications can run in the background, it is hard to determine which app is actually transferring data.
It might also be possible to retrieve information about network activity per process/app, as nettop does on OS X Lion; unfortunately, nettop uses the private NetworkStatistics.framework, so you can't dig anything out of its implementation.
Take time into account.
My 2 cents
No, all applications in iOS are sandboxed, meaning you cannot access anything outside of the application. I do not believe this is possible. Nor do I believe data traffic is recorded at this level on the device; otherwise Apple would have exposed it in either the network page or the usage page in Settings.app.
Besides that, not everybody has a "data plan". In Sweden, for example, it is common for data traffic to be free of charge, with no limit on either volume or speed.

iOS process internals - how to get information?

I am looking for an API to monitor the tasks running on a plain iPhone (no jailbreak). Specifically, I want to:
look for CPU usage (my main concern).
look for memory usage.
look for disk usage (how many read/write)
look for network usage (how many bytes sent and received by network: 3G, Wifi, GSM).
Is it possible to rely on the iOS Simulator running on a Mac, or should I test my application directly on the device?
I think I can look into the system C libraries (sigint, etc.), but I am not sure I can retrieve this information except for the currently running applications. I know some monitoring applications report global usage, but I would like to find the information process by process.
If someone can provide some links or anything useful, I will start a deeper investigation from there.
CPU usage can be found by looking at the code from this related question:
iOS - Get CPU usage from application
Memory usage:
Available memory for iPhone OS app
And here are a couple threads that talk about how to find out about the applications or tasks currently running:
Can we retrieve the applications currently running in iPhone and iPad
How to get information about free memory and running processes in an App Store approved app? (Yes, there is one!)
How to get the active processes running in iOS
The answers to these questions may point you in the direction you'd like to head towards.
Good luck!
The Activity Monitor should get you what you're looking for. You should be able to observe RAM, CPU, and VRAM usage for each iOS process. It's a default tool installed with Xcode. Very handy.

Carrying and Working on an Entire Development Box from a USB Stick. Feasible?

Lately I have been thinking about investing in a decent USB pen drive (something along the lines of this), installing operating systems on virtual machines, and starting to develop on them.
What I have in mind is being able to carry my development boxes around - a Windows installation for .NET development and a Linux distribution for things like RoR, Perl and whatnot - wherever I need them... be it work, school, different computers at home, etc.
I am also thinking of doing this for backup purposes, i.e. backing up my almost-single VM file to an external HD instead of routinely updating my normal Windows box. I am even thinking about putting the VM boxes under source control (is that even feasible?).
So, am I on the right track with this? Would you suggest I try to implement it?
How feasible is it to have your development box on a virtual machine that runs from a USB pen drive?
I absolutely agree with where you are heading. I wish to do this myself.
But in case you don't already know: it's not just about drive size. Believe it or not, USB flash drives can be much slower than your spinning disk drives!
This can be a big problem if you plan to actually run the VMs directly from the USB drive!
I've tried running a 4GB Windows XP VM on a 32GB Corsair Survivor and the VM was virtually unusable! Copying my 4GB VM off and back onto the drive was also quite slow - about 10 minutes to copy it onto the drive.
If you have an eSATA port, I'd highly recommend looking at high-speed eSATA options like this Kanguru 32GB eSATA/USB flash drive or this 32GB one by OCZ.
The read and write speeds of these drives are much higher over eSATA than those of other USB drives, and you can still use them over USB if you don't have an eSATA port. If you don't have an eSATA port, you can buy PCI-to-eSATA cards online, and even eSATA ExpressCards for your laptop.
EDIT: As a side note, you'll find that USB flash drives use FAT instead of NTFS. You don't want to use NTFS because it performs a lot more reads and writes on the disk, and your drive has only a limited number of writes before it dies. But by using FAT you'll be limited to a 2GB maximum file size, which might be a problem for your VM. If that's the case, you can split your VM disks into 2GB chunks. Also make sure you back up your VM daily in case your drive does reach its maximum number of writes. :)
This article on USB thumb drives states: "Never run disk-intensive applications directly against files stored on the thumb drive."
USB thumb drives use flash memory, which has a maximum number of writes before it goes bad and corruption occurs. The author of the previously linked article found it to be in the range of 10,000 - 100,000 writes, which could be an issue if you are running a disk-intensive application.
So if you do this, have an aggressive backup policy for your work. Similarly, when you run your development suite, it would be ideal if it could use the local hard drive as a temporary workspace.
Hopefully you are talking about interpreted-language projects. I couldn't imagine compiling a C/C++ project of any size on a VM, let alone a VM running off a USB drive.
I do it quite frequently with Xen, but I also include a bare-metal bootable kernel on the drive. This is particularly useful when working on something on which a live CD will be based.
The downside is the bloat in the VM image needed to keep it bootable across many machines: where you would normally build only a very lean and mean paravirtualized kernel, you also have to include one with everything and the kitchen sink (up to whatever you need, e.g. do you need audio, or token ring, etc.?).
I usually carry two sticks, one has Xen + a patched Linux 2.6.26, the other has my various guest images which are ready to boot either way. A debootstrapped copy of Debian or Ubuntu makes a great starting point to create the former.
If nothing else, it's fun to tinker with. Sorry to be a bit GNU/Linux centric, but that's what I use exclusively :) I started messing around with this when I had to find an odd path to upgrading my distro, which was two years behind the current one. So I bootstrapped a guest, installed what I wanted, and pointed GRUB at the new LV for my root file system. Inside, I just mounted my old /home LV and away I went.
Check out MojoPac:
http://www.mojopac.com/
Hard-core gamers use it to take World of Warcraft with them on the go -- it should work fine for your development needs, at least on Windows. Use Cygwin with it for your Unix dev needs.
I used to do this, and found that compiling was so deathly slow that it wasn't worth it.
Keep in mind that USB flash drives are extremely slow (maybe 10 to 100 times slower) compared to hard drives at random write performance (writing lots of small files to a partition which already has lots of files).
A typical build process using GNU tools will create lots of small files - a simple configure script creates thousands of small files and deletes them again just to test the environment before you even start compiling. You could be waiting a long time.