How to call a function in Google Colab from a .py file on Google Drive

I have a function with two parameters, check(new_doc, old_doc), written in a .py file called checking.py that I stored on Google Drive.
I would like to call that function in a new Google Colab notebook, but when I do so I get a ValueError: I/O operation on closed file.
The function uses with open(...) twice to read and check the documents given as input; maybe this info can help solve the issue.
This is what I tried:
from google.colab import drive
drive.mount('/content/gdrive')
import sys
sys.path.insert(0, '/content/gdrive/MyDrive/check folder')
from checking import check
check(new_doc, old_doc)
Then I get the ValueError: I/O operation on closed file error.
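For reference, here is a minimal sketch of what checking.py might contain, assuming check reads two plain-text documents; the function body is hypothetical, reconstructed only from the description above:
# checking.py (hypothetical reconstruction of the described setup)
def check(new_doc, old_doc):
    # Each file is closed automatically when its `with` block ends.
    with open(new_doc) as f:
        new_text = f.read()
    with open(old_doc) as f:
        old_text = f.read()
    # Placeholder comparison; the real checking logic lives in the original file.
    return new_text == old_text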

Related

Possibility to save uploaded data in Google Colab for reopening

I recently started solving Kaggle competitions, using two computers (a laptop and a PC). Kaggle provides a large amount of data for training ML models.
The biggest problem for me is downloading that data (about 30 GB), and an even bigger issue is unzipping it. I was working on my laptop, but decided to move to the PC, so I saved the .ipynb file and closed the laptop.
After opening this file I saw that all the unzipped data had gone missing, and I needed to spend two hours downloading and unzipping it once again.
Is it possible to save all the unzipped data with this notebook? Or maybe it's stored somewhere on Google Drive?
You can leverage the storage capacity of Google Drive. Colab allows you to have this data stored on your Drive and access it from a Colab notebook as follows:
from google.colab import drive
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import pandas as pd
drive.mount('/content/gdrive')
img = mpimg.imread(r'/content/gdrive/My Drive/top.bmp') # Reading image files
df = pd.read_csv('/content/gdrive/My Drive/myData.csv') # Loading CSV
When it mounts, it will ask you to visit a particular URL to grant permission for accessing Drive; just paste the token returned. This needs to be done only once per session.
The best thing about Colab is that you can also run shell commands from code cells; all you need to do is prefix the command with a ! (bang). This is useful when you need to unzip files, etc.
import os
os.chdir('gdrive/My Drive/data') #change dir
!ls
!unzip -q iris_data.zip
df3 = pd.read_csv('/content/gdrive/My Drive/data/iris_data.csv')
Note: since you have specified that the data is about 30 GB, this may not be useful if you are on the free tier provided by Google (it gives only 15 GB per account), so you may have to look elsewhere.
You can also visit this particular question for more solutions on Kaggle integration with Google Colab.

How can I download a JSON DataFrame in Google Colab?

I've been working with a DataFrame in Google Colab, and I converted it to JSON format using df.to_json(). Now I'm stuck on how to download it to my local disk (or Google Drive). I found answers, but for the CSV format, not JSON.
Any help is appreciated - thanks!
You can proceed in two steps: first, save the output of to_json() to a file on the Colab backend; then, use the built-in download helper:
from google.colab import files
files.download('colab_file_name')
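Putting the two steps together, a minimal sketch (the DataFrame and the filename df.json are hypothetical placeholders):
import pandas as pd
from google.colab import files

df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})  # placeholder DataFrame

# Step 1: write the JSON to a file on the Colab backend.
df.to_json('df.json')

# Step 2: trigger a browser download of that file.
files.download('df.json')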
Here's a complete example that starts with a randomly generated DataFrame:
https://colab.research.google.com/drive/1K95vU0gUJW4iJ4FaHxZ1HSQguOP1flEg

Write out files with Google Colab

Is there a way to write out files with Google Colab?
For example, if I use
import requests
r = requests.get(url)
Where will those files be stored? Can they be found?
And similarly, can I get the file I output via, say, the TensorFlow save function?
saver = tf.train.Saver(....)
...
path = saver.save(sess, "./my_model.ckpt")
Thanks!
In your first example, the data is still in r.content, so you first need to save it to a file, e.g. with open('data.dat', 'wb').write(r.content).
Then you can download it with files.download:
from google.colab import files
files.download('data.dat')
Downloading your model is the same:
files.download('my_model.ckpt')
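End to end, a minimal sketch of the download flow (the URL and the filename data.dat are hypothetical placeholders):
import requests
from google.colab import files

url = 'https://example.com/data.dat'  # placeholder URL
r = requests.get(url)

# Persist the response body to the Colab VM's local disk...
with open('data.dat', 'wb') as f:
    f.write(r.content)

# ...then trigger a browser download to your machine.
files.download('data.dat')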
I found it is easier to first mount your Google Drive to the non-persistent VM and then use os.chdir() to change your current working folder.
After doing this, you can do exactly the same things as on a local machine.
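For example, a minimal sketch of that approach (the folder name my_project is a hypothetical placeholder):
import os
from google.colab import drive

drive.mount('/content/gdrive')
os.chdir('/content/gdrive/My Drive/my_project')  # placeholder working folder on Drive

# From here on, relative paths behave as they would on a local machine.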
I have a gist listing several ways to save and transfer files between a Colab VM and Google Drive, but I think mounting Google Drive is the easiest approach.
For more details, please refer to mount_your_google_drive.md in this gist:
https://gist.github.com/Joshua1989/dc7e60aa487430ea704a8cb3f2c5d6a6

Export Excel files from Google Colab

I use the following code to save some DataFrame data in Google Colab, but it is saved in the "local file system", not on my computer nor on Google Drive. How can I get the Excel file from there?
Thanks!
writer = pd.ExcelWriter('hey.xlsx')
df.to_excel(writer)
writer.save()
from google.colab import files
files.download('result.csv')
Use Google Chrome; Firefox shows a network error.
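Note that the snippet above writes hey.xlsx but then downloads result.csv; the filename passed to files.download must match the file actually written. A minimal sketch of the corrected write-then-download flow (the DataFrame is a hypothetical placeholder):
import pandas as pd
from google.colab import files

df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})  # placeholder DataFrame

# Write the Excel file on the Colab backend, then download it.
with pd.ExcelWriter('hey.xlsx') as writer:
    df.to_excel(writer)
files.download('hey.xlsx')  # must match the filename written above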
You'll want to upload the file using something like:
from google.colab import files
uploaded = files.upload()
Then, you can pick out the file data using something like uploaded['your_file_here.xls'].
There's a distinct question that includes recipes for working with Excel files in Drive that might be useful:
https://stackoverflow.com/a/47440841/8841057

Saving Variable state in Colaboratory

When I am running a Python script in Colaboratory, it requires re-running all the previous code cells.
Is there any way the previous cells' state/output can be saved, so that I can directly run the next cell after returning to the notebook?
The outputs of Colab cells shown in your browser are stored in notebook JSON saved to Drive. Those will persist.
If you want to save your Python variable state, you'll need to use something like pickle to save to a file and then save that file somewhere outside of the VM.
Of course, that's a bit of trouble. One way to make things easier is to use a FUSE filesystem to mount some persistent storage where you can easily save regular files but have them persist beyond the lifetime of the VM.
An example of using a Drive FUSE wrapper to do this is in this example notebook:
https://colab.research.google.com/notebook#fileId=1mhRDqCiFBL_Zy_LAcc9bM0Hqzd8BFQS3
This notebook shows the following:
Installing a Google Drive FUSE wrapper.
Authenticating and mounting a Google Drive backed filesystem.
Saving local Python variables using pickle as a file on Drive.
Loading the saved variables.
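As a minimal sketch of steps 3 and 4, using the modern drive.mount helper in place of the FUSE wrapper the linked notebook installs (the variables saved are hypothetical placeholders):
import pickle
from google.colab import drive

drive.mount('/content/gdrive')

state = {'weights': [0.1, 0.2], 'epoch': 42}  # placeholder variables to persist

# Save the Python objects as a pickle file on Drive.
with open('/content/gdrive/My Drive/state.pkl', 'wb') as f:
    pickle.dump(state, f)

# Later, in a fresh VM, load them back.
with open('/content/gdrive/My Drive/state.pkl', 'rb') as f:
    state = pickle.load(f)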
It's a nope. As @Bob says in this recent thread: "VMs time out after a period of inactivity, so you'll want to structure your notebooks to install custom dependencies if needed."