import local file to google colab - google-colaboratory

I don't understand how Colab works with directories. I created a notebook, and Colab put it in /Google Drive/Colab Notebooks.
Now I need to import a file (data.py) that contains a bunch of functions I need. Intuition tells me to put the file in that same directory and import it with:
import data
but apparently that's not the way...
I also tried adding the directory to the set of paths, but I am specifying the directory incorrectly.
Can anyone help with this?
Thanks in advance!

Colab notebooks are stored on Google Drive, but they run on a separate virtual machine, so you need to copy data.py there too. To upload data.py through Colab:
from google.colab import files
files.upload()
# choose the file on your computer to upload it, then import it
import data

Google now officially supports accessing and working with Google Drive from Colab.
You can use the code below to mount your Drive in Colab:
from google.colab import drive
drive.mount('/gdrive')
%cd /gdrive/My\ Drive/{location you want to move}
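Once the Drive is mounted, the import from the original question can be made to work by putting the notebook's folder on the module search path. This is only a minimal sketch: the helper name import_from_directory is hypothetical, and the folder in the usage comment is an assumption based on the location mentioned in the question.

```python
import importlib
import sys
from pathlib import Path


def import_from_directory(directory, module_name):
    """Put `directory` on sys.path (once) and import `module_name` from it."""
    directory = str(Path(directory).expanduser())
    if directory not in sys.path:
        sys.path.insert(0, directory)
    # Make sure Python notices files that appeared after the interpreter started.
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


# In Colab, after drive.mount('/gdrive'), this might look like
# (the folder name is an assumption taken from the question):
# data = import_from_directory('/gdrive/My Drive/Colab Notebooks', 'data')
```

A plain sys.path.append(...) followed by import data does the same thing; the helper just avoids adding the same folder twice.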

To easily upload a local file, you can use the Colab file browser:
Click the right arrow on the left of your screen (below the Google Colab logo).
Select the Files tab.
Click the Upload button.
This opens a dialog where you can choose a file to upload from your local filesystem.

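To upload local files from your system to the Colab storage/directory:
from google.colab import files
def getLocalFiles():
    # files.upload() returns a dict mapping file names to their contents (bytes)
    _files = files.upload()
    for k, v in _files.items():
        # write each uploaded file to the current working directory
        with open(k, 'wb') as f:
            f.write(v)
getLocalFiles()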

So, here is how I finally solved this. I have to point out however, that in my case I had to work with several files and proprietary modules that were changing all the time.
The best solution I found to do this was to use a FUSE wrapper to "link" colab to my google account. I used this particular tool:
https://github.com/astrada/google-drive-ocamlfuse
There is an example of how to set up your environment there, but here is how I did it:
# Install a Drive FUSE wrapper.
!apt-get install -y -qq software-properties-common python-software-properties module-init-tools
!add-apt-repository -y ppa:alessandro-strada/ppa 2>&1 > /dev/null
!apt-get update -qq 2>&1 > /dev/null
!apt-get -y install -qq google-drive-ocamlfuse fuse
# Generate auth tokens for Colab
from google.colab import auth
auth.authenticate_user()
# Generate creds for the Drive FUSE library.
from oauth2client.client import GoogleCredentials
creds = GoogleCredentials.get_application_default()
import getpass
!google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret} < /dev/null 2>&1 | grep URL
vcode = getpass.getpass()
!echo {vcode} | google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret}
At this point you'll have installed the wrapper, and the code above will generate a couple of links for you to authorize access to your Google Drive account.
Then you have to create a folder in the Colab file system (remember this is not persistent, as far as I know...) and mount your Drive there:
# Create a directory and mount Google Drive using that directory.
!mkdir -p drive
!google-drive-ocamlfuse drive
print ('Files in Drive:')
!ls drive/
The !ls command will print the directory contents so you can check that it works, and that's it. You now have all the files you need and you can make changes to them with no further complications. Remember that you may need to restart the kernel to update the imports and variables.
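For modules that change often, reloading a single module can sometimes stand in for a full kernel restart. The following is a sketch only, with a hypothetical helper name (reload_module); it relies on the standard importlib.reload and is not part of the original setup.

```python
import importlib
import sys


def reload_module(module_name):
    """Import `module_name`, or re-execute it if it was already imported."""
    # Drop cached directory listings so newly synced files are seen.
    importlib.invalidate_caches()
    if module_name in sys.modules:
        return importlib.reload(sys.modules[module_name])
    return importlib.import_module(module_name)


# Usage in a notebook cell after editing data.py on the mounted Drive:
# data = reload_module('data')
```

Note that names pulled in earlier with "from data import something" keep pointing at the old objects, so a restart is still the safer option when in doubt.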
Hope this works for someone!

You can run the following commands in Colab to mount the Drive:
from google.colab import drive
drive.mount('/content/gdrive')
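You can also download a file from an external URL with the standard Linux wget command. By default wget saves into the current working directory, so change into the mounted Drive folder first (or pass -P with a target directory) if the file should end up in Drive: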
!wget 'https://dataverse.harvard.edu/dataset'

Related

Rendering in Google Colab gone wrong

I am a newbie Blender user. I made my first animation yesterday and tried to render it in Google Colab. I ran some code that worked for a YouTuber running the Blender 2.91 Linux build, but the same code produced an error when I ran it.
I am currently using Windows 10 and am really new to Blender. I need working code that can successfully render an animation made with Blender in Colab.
This is the code that I found online and ran. Please help :(
#Download Blender from Repository
!wget http://download.blender.org/release/Blender2.93/blender-2.93.0-linux-x64.tar.xz
#Install Blender
!tar xf blender-2.93.0-linux-x64.tar.xz
#Connect Google Drive
from google.colab import drive
drive.mount('/gdrive')
#Set Paths to Blender files
filename = '/gdrive/MyDrive/SHIP IN WATER With Particles.blend'
#Render an animation
!sudo ./blender-2.93.0-linux-x64/blender -b $filename -noaudio -E 'Cycles' -o '//image_####' -s 0 -e 72 -a -- --cycles-device OpenCL
The output of the last line was:
sudo: ./blender-2.93.0-linux-x64/blender: command not found
In short, I want a working code that can help me render Animation made in Blender in Google Colab.
Thank you in advance.... :)
The problem with your code is that the folder name is not the same after extracting from the tar file. Here is what you can do to fix your issue:
#Download Blender from Repository
!wget http://download.blender.org/release/Blender2.93/blender-2.93.0-linux-x64.tar.xz
#Install Blender
!tar xf blender-2.93.0-linux-x64.tar.xz
#Connect Google Drive
from google.colab import drive
drive.mount('/gdrive')
#Set Paths to Blender files
filename = '/gdrive/MyDrive/SHIP IN WATER With Particles.blend'
After that, run !ls to list the files and folders in the current directory. Copy the extracted folder name from there and replace the old folder name with it:
OLD
./blender-2.93.0-linux-x64/blender
NEW
./NewFolderName/blender
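Instead of copying the folder name by hand from the !ls output, the lookup could be scripted. This is a rough sketch with a hypothetical helper (find_blender_binary); it assumes the archive unpacks into a folder whose name starts with "blender" and that the executable inside is called blender.

```python
import glob
import os


def find_blender_binary(search_root='.'):
    """Return the first file named 'blender' inside a directory whose name
    starts with 'blender' under `search_root`, or None if nothing matches."""
    pattern = os.path.join(search_root, 'blender*', 'blender')
    for candidate in sorted(glob.glob(pattern)):
        if os.path.isfile(candidate):
            return candidate
    return None


# In Colab, after extracting the archive into /content:
# blender = find_blender_binary('/content')
# print(blender)
```

The returned path can then replace the hard-coded ./blender-2.93.0-linux-x64/blender in the render command.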

Best Practice for Kaggle Datasets with Colab

I was wondering if anyone could confirm the best practice for downloading kaggle datasets to our colab notebooks?
I have seen code examples like the one below where we download the API token file and upload it to the environment, is that the best practice or is there a different/simpler/better approach?
Thanks in advance!
Jacob
from google.colab import files
!pip install -q kaggle
files.upload()
!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/
!chmod 600 /root/.kaggle/kaggle.json
!kaggle datasets download -d alxmamaev/flowers-recognition
There are 2 approaches that are both convenient:
1) Save your kaggle.json in Google Drive. Then mount Drive by just clicking the button in the left pane, and copy the file over:
!mkdir -p ~/.kaggle
!cp "drive/My Drive/kaggle.json" ~/.kaggle/
# the rest is the same
2) Embed the kaggle.json in Colab itself.
!mkdir ~/.kaggle
!echo '{"username":"korakot","key":"8db2xxx"}' > ~/.kaggle/kaggle.json
# the rest is the same
If you are worried about security, use the first, which is more secure.
If you are lazy, use the second.
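The mkdir/cp/chmod steps can also be done from Python. This is a minimal sketch, not an official Kaggle helper: the function name write_kaggle_credentials and its home parameter are hypothetical, and the only assumption about Kaggle is the ~/.kaggle/kaggle.json location and 600 permissions already used in the question.

```python
import json
import os
from pathlib import Path


def write_kaggle_credentials(username, key, home=None):
    """Write <home>/.kaggle/kaggle.json with owner-only permissions."""
    home = Path(home) if home is not None else Path.home()
    kaggle_dir = home / '.kaggle'
    kaggle_dir.mkdir(parents=True, exist_ok=True)
    target = kaggle_dir / 'kaggle.json'
    target.write_text(json.dumps({'username': username, 'key': key}))
    # Same effect as: chmod 600 ~/.kaggle/kaggle.json
    os.chmod(target, 0o600)
    return target


# Usage in a notebook, prompting for the key instead of storing it in a cell:
# import getpass
# write_kaggle_credentials('your_username', getpass.getpass('Kaggle key: '))
```

Prompting with getpass keeps the key out of the saved notebook, which sits somewhere between the two approaches above.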

How to use google colab to run the file 'train.py' in MyFolder?

I have uploaded a folder to Google Drive with all of my files. Its structure is like this:
MyFolder
-Images
-train.py
-classify.py
-Facenet.py
I mounted the folder by following the instructions in How to Upload Many Files to Google Colab?
On my computer I simply go to MyFolder, open the terminal, and run python train.py. How do I do the same thing in Google Colab? I have uploaded MyFolder to Google Drive.
Edit: After mounting, I changed my directory to MyFolder (credits: Google colab changing directory). I have run train.py and it's still running. I hope everything works fine. So now all I want to know is whether all changes made by running the script will be stored in MyFolder on Drive itself.
My problem is solved. The two links mentioned are enough.
First upload MyFolder to Google Drive. Then go to Google Colab (using the same Gmail account). Copy and paste the following script and run it in Google Colab (see How to Upload Many Files to Google Colab?).
# Install a Drive FUSE wrapper.
# https://github.com/astrada/google-drive-ocamlfuse
!apt-get install -y -qq software-properties-common python-software-properties module-init-tools
!add-apt-repository -y ppa:alessandro-strada/ppa 2>&1 > /dev/null
!apt-get update -qq 2>&1 > /dev/null
!apt-get -y install -qq google-drive-ocamlfuse fuse
# Generate auth tokens for Colab
from google.colab import auth
auth.authenticate_user()
# Generate creds for the Drive FUSE library.
from oauth2client.client import GoogleCredentials
creds = GoogleCredentials.get_application_default()
import getpass
!google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret} < /dev/null 2>&1 | grep URL
vcode = getpass.getpass()
!echo {vcode} | google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret}
# Create a directory and mount Google Drive using that directory.
!mkdir -p drive
!google-drive-ocamlfuse drive
!ls drive/
# Create a file in Drive.
!echo "This newly created file will appear in your Drive file list." > drive/created.txt
After that, use %cd drive/ to move into the Drive and then %cd MyFolder to move inside MyFolder (see Google colab changing directory). In Colab, a leading ! runs a Linux shell command and a leading % runs an IPython magic such as %cd; use %cd rather than !cd, because !cd does not persist beyond that one command. To train on the data, run the corresponding command:
!python train.py
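The same "go to MyFolder and run python train.py" step can be sketched in plain Python, which also bears on the question of where the outputs end up: files the script writes to relative paths land in the working directory, so they should be stored in MyFolder on Drive when that folder is on the mount. The helper name run_script and the example path are hypothetical.

```python
import subprocess
import sys


def run_script(folder, script='train.py'):
    """Run `script` with the current Python interpreter, using `folder`
    as the working directory, and return the completed process."""
    return subprocess.run(
        [sys.executable, script],
        cwd=folder,            # relative paths in the script resolve here
        capture_output=True,   # collect stdout/stderr instead of streaming them
        text=True,
        check=False,
    )


# Usage (the path is an assumption about where MyFolder sits on the mount):
# result = run_script('drive/MyFolder')
# print(result.stdout)
# print(result.stderr)
```

For long training runs the !python train.py form is usually nicer, because it streams output while the script is running.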

Mount multiple drives in google colab

I use this function to mount my google drive
from google.colab import drive
drive.mount('/content/drive', force_remount=True)
and then copy files from it like this
!tar -C "/home/" -xvf '/content/drive/My Drive/files.tar'
I want to copy files from 2 drives, but when I try to run the first script again it just remounts my 1st drive.
How can I mount the 1st drive, copy files, then mount another drive and copy files from the 2nd drive?
Just in case anyone really needs to mount more than one drive, here's a workaround for mounting 2 drives.
First, mount the first drive using
from google.colab import drive
drive.mount('/drive1')
Then, use the following script to mount the second drive.
!apt-get install -y -qq software-properties-common python-software-properties module-init-tools
!add-apt-repository -y ppa:alessandro-strada/ppa 2>&1 > /dev/null
!apt-get update -qq 2>&1 > /dev/null
!apt-get -y install -qq google-drive-ocamlfuse fuse
from google.colab import auth
auth.authenticate_user()
from oauth2client.client import GoogleCredentials
creds = GoogleCredentials.get_application_default()
import getpass
!google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret} < /dev/null 2>&1 | grep URL
vcode = getpass.getpass()
!echo {vcode} | google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret}
!mkdir -p /drive2
!google-drive-ocamlfuse /drive2
Now, you will be able to access files from the first drive from /drive1/My Drive/ and those of the second drive from /drive2/ (the second method doesn't create the My Drive folder automatically).
Cheers!
Fun Fact: The second method was actually a commonly used method to mount Google drive in Colab environment before Google came out with google.colab.drive
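With both mounts in place, the copy step from the question (unpacking a tar file from each drive) might be sketched as below. The helper name extract_archives is hypothetical, and the paths in the usage comment assume the /drive1 and /drive2 mount points used above plus the files.tar name from the question.

```python
import shutil
from pathlib import Path


def extract_archives(archives, destination):
    """Unpack each archive in `archives` into `destination` and return
    the list of archives that were actually found and unpacked."""
    destination = Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    unpacked = []
    for archive in map(Path, archives):
        if not archive.is_file():
            continue  # skip a mount that does not contain the file
        shutil.unpack_archive(str(archive), str(destination))
        unpacked.append(archive)
    return unpacked


# Usage (paths are assumptions based on the mount points above):
# extract_archives(
#     ['/drive1/My Drive/files.tar', '/drive2/files.tar'],
#     '/home',
# )
```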
The colab drive module doesn't really support what you describe.
It might be simplest to share the files/folders you want to read from the second account's Drive to the first account's Drive (e.g. drive.google.com) and then read everything from the same mount.
If you're getting an exception with Suyog Jadhav's method:
MessageError: Error: credential propagation was unsuccessful
Follow the steps 1 to 3 described by Alireza Mazochi
https://stackoverflow.com/a/69881106/10214361
Follow these steps:
1- Run the below code:
!sudo add-apt-repository -y ppa:alessandro-strada/ppa 2>&1 > /dev/null
!sudo apt-get update -qq 2>&1 > /dev/null
!sudo apt -y install -qq google-drive-ocamlfuse 2>&1 > /dev/null
!google-drive-ocamlfuse
2- Give permissions to GFUSE
The previous step produces an error like the one below. Click on the link in that error message and authenticate your account.
Failure("Error opening URL:https://accounts.google.com/o/oauth2/auth?client_id=... ")
3- Run the below code:
!sudo apt-get install -qq w3m # to act as web browser
!xdg-settings set default-web-browser w3m.desktop # to set default browser
%cd /content
!mkdir drive
%cd drive
!mkdir MyDrive
%cd ..
%cd ..
!google-drive-ocamlfuse /content/drive/MyDrive
After this step, you will have a folder with your second drive.
Another option is Rclone, a command-line program to manage files on cloud storage. It is a feature-rich alternative to cloud vendors' web storage interfaces. Rclone supports over 40 cloud storage products, including S3 object stores, business and consumer file storage services, and standard transfer protocols.
The article at the following URL explains how to set it up in Colab. You can even link one drive and other cloud storage products too.
https://towardsdatascience.com/why-you-should-try-rclone-with-google-drive-and-colab-753f3ec04ba1

How do I upload a file to Google Colaboratory that is already on Google Drive?

https://colab.research.google.com/notebooks/io.ipynb#scrollTo=KHeruhacFpSU
In this notebook help it explains how to upload a file to drive and then download to Colaboratory but my files are already in drive.
Where can I find the file ID ?
# Download the file we just uploaded.
#
# Replace the assignment below with your file ID
# to download a different file.
#
# A file ID looks like: 1uBtlaggVyWshwcyP6kEI-y_W3P8D26sz
file_id = 'target_file_id'
My advice would be to use PyDrive for this (see its docs).
You could also do this via the Drive UI -- I think the shortest path is to select the file, click "Get shareable link" -- it's the id parameter in the resulting URL. (If the file wasn't shared when you started, you'll want to then uncheck the green "link" button.)
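Pulling the ID out of a copied link can be automated. This is a sketch under the assumption that the link has one of two common shapes: an id query parameter (as described above) or a /file/d/<id>/ path segment. The function name extract_drive_file_id is hypothetical, and other link shapes may exist.

```python
import re
from urllib.parse import parse_qs, urlparse


def extract_drive_file_id(url):
    """Return the file ID found in a Drive link, or None if none is found."""
    parsed = urlparse(url)
    # Shape 1: ...?id=<FILE_ID>
    query_ids = parse_qs(parsed.query).get('id')
    if query_ids:
        return query_ids[0]
    # Shape 2: .../file/d/<FILE_ID>/view
    match = re.search(r'/d/([A-Za-z0-9_-]+)', parsed.path)
    return match.group(1) if match else None


# Example with the sample ID from the question:
# extract_drive_file_id(
#     'https://drive.google.com/open?id=1uBtlaggVyWshwcyP6kEI-y_W3P8D26sz')
```

The result can be assigned to file_id in the snippet from the question.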
Connect to Google Drive using the snippet below.
You will have to authenticate twice using the link from the cell output. Once this step is taken care of, you can load files from Drive and save to Drive directly, as you would locally.
!apt-get install -y -qq software-properties-common python-software-properties module-init-tools
!add-apt-repository -y ppa:alessandro-strada/ppa 2>&1 > /dev/null
!apt-get update -qq 2>&1 > /dev/null
!apt-get -y install -qq google-drive-ocamlfuse fuse
from google.colab import auth
auth.authenticate_user()
from oauth2client.client import GoogleCredentials
creds = GoogleCredentials.get_application_default()
import getpass
!google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret} < /dev/null 2>&1 | grep URL
vcode = getpass.getpass()
!echo {vcode} | google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret}
!mkdir -p drive
!google-drive-ocamlfuse drive
Read a CSV using pandas:
import pandas as pd
df = pd.read_csv('drive/path/file.csv')
Save a CSV (use index=False if you don't need the index as the first column in the CSV):
df.to_csv('drive/path/file.csv', index=False)
You can use the CurlWget extension in Chrome. If you want to download anything, just click on download, and as soon as it starts downloading you can cancel the download. Open CurlWget and copy the whole command it generated for the file or data.
Go to Colab, add a cell, paste the command, and put a ! before it.
It is better to use the Colab API:
from google.colab import drive
drive.mount('/content/drive')