Link to images within google drive from a colab notebook - google-colaboratory

I would like to store image files on Google Drive and link to them from a Colab notebook. Is this possible? For example:
google-drive/
    notebook.ipynb
    images/
        pic.jpg
Within notebook.ipynb markdown cell:
![Alternate Text](images/pic.jpg)

Google always loses me in the details. That's where the devil is. Based on the previous answer, these are my steps to make an image sharable so it can be used in a notebook.
1. In your Google Drive, create a public folder as follows:
   a. Create a new folder and name it Image, for instance.
   b. Right-click the folder just created and select Share from the drop-down menu.
   c. In the popup dialog window, click the Advanced link in the bottom right.
   d. In the section Who has access, select "Public on the web - Anyone on the Internet can find and view".
   e. Click the Done button.
2. Store the image in the folder you just created.
3. Right-click on the image, and from the drop-down menu select Share.
4. Click the Copy link button. A link is copied to your clipboard.
5. From the link, copy the long image ID made of numbers and letters.
6. Replace imageID in the following URL with the ID you copied in the previous step: https://docs.google.com/uc?export=download&id=imageID
7. Copy this URL into a markdown image tag, such as: ![test](https://docs.google.com/uc?export=download&id=mmXXDD123zDGV51twxSCGAAX23)
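The ID extraction and URL building in the steps above can also be scripted. This is only a minimal sketch: it assumes the copied share link has one of two common shapes (`.../file/d/<ID>/view...` or `...open?id=<ID>`), and the helper names `extract_file_id` and `markdown_image` are made up for illustration. The download prefix is the one quoted in the steps.

```python
import re
from urllib.parse import urlparse, parse_qs

# URL prefix taken from the steps above.
DOWNLOAD_PREFIX = "https://docs.google.com/uc?export=download&id="

def extract_file_id(share_link):
    """Return the file ID found in a Drive share link, or None if there is none."""
    # Assumed form 1: https://drive.google.com/file/d/<ID>/view?usp=sharing
    match = re.search(r"/d/([A-Za-z0-9_-]+)", share_link)
    if match:
        return match.group(1)
    # Assumed form 2: https://drive.google.com/open?id=<ID>
    ids = parse_qs(urlparse(share_link).query).get("id")
    return ids[0] if ids else None

def markdown_image(share_link, alt_text="image"):
    """Build a markdown image tag that points at the direct-download URL."""
    file_id = extract_file_id(share_link)
    if file_id is None:
        raise ValueError("no file ID found in link: " + share_link)
    return "![{}]({}{})".format(alt_text, DOWNLOAD_PREFIX, file_id)

# The ID below is the placeholder used in the steps, not a real file.
link = "https://drive.google.com/file/d/mmXXDD123zDGV51twxSCGAAX23/view?usp=sharing"
print(markdown_image(link, "test"))
```

The printed tag can be pasted into a markdown cell. The image still has to be shared publicly for the link to resolve.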

A possible alternative to the great answer by Arpit Gupta, for images that are publicly shared: get the ID of your file and prepend this URL to it:
https://docs.google.com/uc?export=download&id=
Grabbed it from this forum post.

Use the following code:
!apt-get install -y -qq software-properties-common python-software-properties module-init-tools
!add-apt-repository -y ppa:alessandro-strada/ppa 2>&1 > /dev/null
!apt-get update -qq 2>&1 > /dev/null
!apt-get -y install -qq google-drive-ocamlfuse fuse
from google.colab import auth
auth.authenticate_user()
from oauth2client.client import GoogleCredentials
creds = GoogleCredentials.get_application_default()
import getpass
!google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret} < /dev/null 2>&1 | grep URL
vcode = getpass.getpass()
!echo {vcode} | google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret}
After running the commands above, you will be asked for a key. Click the provided link, verify your account, and enter the key.
Create a local folder and mount Drive on it using:
!mkdir -p colabData
!google-drive-ocamlfuse colabData
After this you can use the drive as if it were connected locally:
%%bash
echo "Hello World...!!!" > colabData/hello.txt
ls colabData

Related

How to increase Google Colab storage capacity

I am working on a 70 GB dataset with a GPU.
Can someone suggest a way to make a new notebook with more than 300 GB available, or a way to go back to the previous state?
You can get a new notebook with more than 300 GB using the following approach:
Buy extra space in the corresponding account, either in Google Drive or in GCP storage, then mount it in the notebook.
You can also mount multiple Drive sources. The following example mounts multiple Google Drive accounts for your reference. It is useful if you have several Google Drive accounts that together offer more space, because you can then use that space through different mount points.
# Code to mount the first mount point:
from google.colab import drive
drive.mount('/content/drive01')
# Follow the verification process, and enter the token.
# Code to mount the second mount point:
!apt-get install -y -qq software-properties-common module-init-tools
!add-apt-repository -y ppa:alessandro-strada/ppa 2>&1 > /dev/null
!apt-get update -qq 2>&1 > /dev/null
!apt-get -y install -qq google-drive-ocamlfuse fuse
from google.colab import auth
from oauth2client.client import GoogleCredentials
import getpass
auth.authenticate_user()
creds = GoogleCredentials.get_application_default()
prompt = !google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret} < /dev/null 2>&1 | grep URL
vcode = getpass.getpass(prompt[0] + '\n\nEnter verification code: ')
!echo {vcode} | google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret}
!sudo mkdir /content/drive02
!google-drive-ocamlfuse /content/drive02
Follow the verification process and enter the token twice. Please note that you need to click the account authentication URL twice and enter each token separately.
Credit (some issues have been fixed):
https://www.youtube.com/watch?v=qQ3CUHjbJ0w
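Before copying a large dataset, it may help to check how much free space each mount point reports. The sketch below uses the standard library's `shutil.disk_usage`; the helper name `free_gib` is made up for illustration. One caveat to keep in mind: a FUSE-mounted Drive may report numbers that do not match the account's actual quota, so treat the result as a rough indication only.

```python
import shutil

def free_gib(path):
    """Return the free space, in GiB, of the filesystem that contains path."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    return usage.free / 2**30

# In Colab the list would hold the mount points used above, for example
# "/content/drive01" and "/content/drive02". The current directory is used
# here so that the sketch also runs outside Colab.
for mount_point in ["."]:
    print("{}: {:.1f} GiB free".format(mount_point, free_gib(mount_point)))
```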

How to use google colab to run the file 'train.py' in MyFolder?

I have uploaded a folder to Google Drive with all of my files. Its structure is like this:
MyFolder
-Images
-train.py
-classify.py
-Facenet.py
I mounted the folder by following the instructions in How to Upload Many Files to Google Colab?
On my computer I simply go to MyFolder, open the terminal, and run python train.py. How do I do the same thing in Google Colab? I have uploaded MyFolder to Google Drive.
Edit: After mounting, I changed my directory to MyFolder (credits: Google colab changing directory). I have run train.py and it's still running. I hope everything works fine. So now all I want to know is whether all changes made by running the script will be stored in MyFolder on Drive itself.
My problem is solved. The two links mentioned are enough.
First upload MyFolder to Google Drive. Then go to Google Colab (using the same Gmail account). Copy and paste the following script and run it in Google Colab (see How to Upload Many Files to Google Colab?).
# Install a Drive FUSE wrapper.
# https://github.com/astrada/google-drive-ocamlfuse
!apt-get install -y -qq software-properties-common python-software-properties module-init-tools
!add-apt-repository -y ppa:alessandro-strada/ppa 2>&1 > /dev/null
!apt-get update -qq 2>&1 > /dev/null
!apt-get -y install -qq google-drive-ocamlfuse fuse
# Generate auth tokens for Colab
from google.colab import auth
auth.authenticate_user()
# Generate creds for the Drive FUSE library.
from oauth2client.client import GoogleCredentials
creds = GoogleCredentials.get_application_default()
import getpass
!google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret} < /dev/null 2>&1 | grep URL
vcode = getpass.getpass()
!echo {vcode} | google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret}
# Create a directory and mount Google Drive using that directory.
!mkdir -p drive
!google-drive-ocamlfuse drive
!ls drive/
# Create a file in Drive.
!echo "This newly created file will appear in your Drive file list." > drive/created.txt
After that you can use the command %cd drive/ to move into the drive and then %cd MyFolder to move inside MyFolder (see Google colab changing directory). You can run Linux terminal commands by putting ! at the beginning of the command; %cd is a notebook magic that changes the working directory for the cells that follow. To train on the data, run the corresponding command:
!python train.py
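On the question of where the script's changes end up: files that a script writes with relative paths land in its working directory, so if that directory is the mounted MyFolder they should be written to Drive (assuming the mount is writable). The sketch below illustrates the working-directory behaviour with the standard library only. A temporary directory stands in for MyFolder, and the tiny train.py it creates is a hypothetical stand-in, not the real training script.

```python
import pathlib
import subprocess
import sys
import tempfile

def run_script(folder, script_name):
    """Run script_name with the current Python, using folder as working directory."""
    return subprocess.run(
        [sys.executable, script_name],
        cwd=str(folder),
        check=True,
        capture_output=True,
        text=True,
    )

# Stand-in for MyFolder: a temporary directory with a hypothetical train.py
# that writes one output file using a relative path.
with tempfile.TemporaryDirectory() as tmp:
    folder = pathlib.Path(tmp)
    (folder / "train.py").write_text(
        "open('model.txt', 'w').write('trained')\nprint('done')\n"
    )
    result = run_script(folder, "train.py")
    created_in_folder = (folder / "model.txt").exists()

print(result.stdout.strip())  # what the script printed
print(created_in_folder)      # whether the relative output landed in the folder
```

In a notebook, `%cd MyFolder` followed by `!python train.py` has the same effect as passing `cwd` here.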

Mount multiple drives in google colab

I use this code to mount my Google Drive:
from google.colab import drive
drive.mount('/content/drive', force_remount=True)
and then copy files from it like this:
!tar -C "/home/" -xvf '/content/drive/My Drive/files.tar'
I want to copy files from two drives, but when I run the first script again it just remounts my first drive.
How can I mount the first drive, copy files, then mount another drive and copy files from the second drive?
Just in case anyone really needs to mount more than one drive, here's a workaround for mounting 2 drives.
First, mount the first drive using
from google.colab import drive
drive.mount('/drive1')
Then, use the following script to mount the second drive.
!apt-get install -y -qq software-properties-common python-software-properties module-init-tools
!add-apt-repository -y ppa:alessandro-strada/ppa 2>&1 > /dev/null
!apt-get update -qq 2>&1 > /dev/null
!apt-get -y install -qq google-drive-ocamlfuse fuse
from google.colab import auth
auth.authenticate_user()
from oauth2client.client import GoogleCredentials
creds = GoogleCredentials.get_application_default()
import getpass
!google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret} < /dev/null 2>&1 | grep URL
vcode = getpass.getpass()
!echo {vcode} | google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret}
!mkdir -p /drive2
!google-drive-ocamlfuse /drive2
Now, you will be able to access files from the first drive from /drive1/My Drive/ and those of the second drive from /drive2/ (the second method doesn't create the My Drive folder automatically).
Cheers!
Fun Fact: The second method was actually a commonly used method to mount Google drive in Colab environment before Google came out with google.colab.drive
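Once both drives are mounted, files from each can be gathered into one working directory. The following is a rough sketch using only the standard library. The helper name `collect` and the folder layout are assumptions for illustration, and temporary directories stand in for the two mount points so that it runs anywhere.

```python
import pathlib
import shutil
import tempfile

def collect(sources, destination):
    """Copy each source directory into its own subfolder of destination."""
    destination = pathlib.Path(destination)
    for source in sources:
        source = pathlib.Path(source)
        # copytree creates destination/<source name>, including missing parents.
        shutil.copytree(source, destination / source.name)
    return sorted(
        p.relative_to(destination).as_posix()
        for p in destination.rglob("*")
        if p.is_file()
    )

# In Colab the sources would be folders under /drive1/My Drive/ and /drive2/
# (hypothetical names). Temporary directories stand in for them here.
with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    for name in ("drive1", "drive2"):
        (root / name).mkdir()
        (root / name / "files.txt").write_text(name)
    copied = collect([root / "drive1", root / "drive2"], root / "work")

print(copied)
```

Giving each source its own subfolder avoids name clashes when both drives contain files with the same name.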
The colab drive module doesn't really support what you describe.
It might be simplest to share the files/folders you want to read from the second account's Drive to the first account's Drive (e.g. drive.google.com) and then read everything from the same mount.
If you're getting an exception with Suyog Jadhav's method:
MessageError: Error: credential propagation was unsuccessful
Follow the steps 1 to 3 described by Alireza Mazochi
https://stackoverflow.com/a/69881106/10214361
Follow these steps:
1- Run the below code:
!sudo add-apt-repository -y ppa:alessandro-strada/ppa 2>&1 > /dev/null
!sudo apt-get update -qq 2>&1 > /dev/null
!sudo apt -y install -qq google-drive-ocamlfuse 2>&1 > /dev/null
!google-drive-ocamlfuse
2- Give permissions to GFUSE
The previous step produces an error like the one below. Click the link in the error message and authenticate your account.
Failure("Error opening URL:https://accounts.google.com/o/oauth2/auth?client_id=... ")
3- Run the below code:
!sudo apt-get install -qq w3m # to act as web browser
!xdg-settings set default-web-browser w3m.desktop # to set default browser
%cd /content
!mkdir drive
%cd drive
!mkdir MyDrive
%cd ..
%cd ..
!google-drive-ocamlfuse /content/drive/MyDrive
After this step, you will have a folder with your second drive.
There is also Rclone, a command-line program to manage files on cloud storage. It is a feature-rich alternative to cloud vendors' web storage interfaces. Over 40 cloud storage products support rclone, including S3 object stores, business and consumer file storage services, as well as standard transfer protocols.
The following URL explains how to set it up in Colab. You can even link one drive and other cloud storage products too.
https://towardsdatascience.com/why-you-should-try-rclone-with-google-drive-and-colab-753f3ec04ba1

import local file to google colab

I don't understand how Colab works with directories. I created a notebook, and Colab put it in /Google Drive/Colab Notebooks.
Now I need to import a file (data.py) where I have a bunch of functions I need. Intuition tells me to put the file in that same directory and import it with:
import data
but apparently that's not the way...
I also tried adding the directory to the set of paths, but I am specifying the directory incorrectly.
Can anyone help with this?
Thanks in advance!
Colab notebooks are stored on Google Drive, but they run on a separate virtual machine. So you need to copy your data.py there too. Do this to upload data.py through Colab:
from google.colab import files
files.upload()
# choose the file on your computer to upload it then
import data
Google now officially supports accessing and working with Google Drive from Colab.
You can use the code below to mount your drive in Colab:
from google.colab import drive
drive.mount('/gdrive')
%cd /gdrive/My\ Drive/{location you want to move}
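With the drive mounted, one way to make a module such as data.py importable is to add its folder to `sys.path`. The sketch below is illustrative only: the helper name `import_from_dir` is made up, the Colab path mentioned in the comment is an assumption about where the file lives, and a temporary directory with a small demo module stands in for the Drive folder so that the code runs anywhere.

```python
import importlib
import pathlib
import sys
import tempfile

def import_from_dir(module_name, directory):
    """Make directory importable, then import module_name from it."""
    directory = str(directory)
    if directory not in sys.path:
        sys.path.insert(0, directory)
    importlib.invalidate_caches()  # notice files created after startup
    return importlib.import_module(module_name)

# In Colab, directory might be "/gdrive/My Drive/Colab Notebooks" (an
# assumption about where data.py is stored). A temporary directory stands in
# for it here, and the module is called demo_data to avoid name clashes.
with tempfile.TemporaryDirectory() as tmp:
    pathlib.Path(tmp, "demo_data.py").write_text(
        "def double(x):\n    return 2 * x\n"
    )
    demo_data = import_from_dir("demo_data", tmp)
    print(demo_data.double(21))  # prints 42
```

Passing the full absolute path, spaces included, avoids the usual mistake of escaping the space in "My Drive" inside a Python string.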
To easily upload a local file you can use the new Google Colab feature:
click the right arrow on the left of your screen (below the Google Colab logo)
select the Files tab
click the Upload button
It will open a popup to choose a file to upload from your local filesystem.
To upload local files from your system to the Colab storage/directory:
from google.colab import files

def getLocalFiles():
    # files.upload() returns a dict mapping each file name to its bytes.
    _files = files.upload()
    if len(_files) > 0:
        for k, v in _files.items():
            with open(k, 'wb') as f:
                f.write(v)

getLocalFiles()
So, here is how I finally solved this. I have to point out however, that in my case I had to work with several files and proprietary modules that were changing all the time.
The best solution I found to do this was to use a FUSE wrapper to "link" colab to my google account. I used this particular tool:
https://github.com/astrada/google-drive-ocamlfuse
There is an example of how to set up your environment there, but here is how I did it:
# Install a Drive FUSE wrapper.
!apt-get install -y -qq software-properties-common python-software-properties module-init-tools
!add-apt-repository -y ppa:alessandro-strada/ppa 2>&1 > /dev/null
!apt-get update -qq 2>&1 > /dev/null
!apt-get -y install -qq google-drive-ocamlfuse fuse
# Generate auth tokens for Colab
from google.colab import auth
auth.authenticate_user()
# Generate creds for the Drive FUSE library.
from oauth2client.client import GoogleCredentials
creds = GoogleCredentials.get_application_default()
import getpass
!google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret} < /dev/null 2>&1 | grep URL
vcode = getpass.getpass()
!echo {vcode} | google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret}
At this point you'll have installed the wrapper and the code above will generate a couple of links for you to authorize access to your google drive account.
Then you have to create a folder in the Colab file system (remember this is not persistent, as far as I know...) and mount your drive there:
# Create a directory and mount Google Drive using that directory.
!mkdir -p drive
!google-drive-ocamlfuse drive
print ('Files in Drive:')
!ls drive/
The !ls command will print the directory contents so you can check it works, and that's it. You now have all the files you need and you can make changes to them with no further complications. Remember that you may need to restart the kernel to update the imports and variables.
Hope this works for someone!
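As an alternative to restarting the kernel after editing a module, `importlib.reload` can re-execute a module that is already imported. This is a minimal sketch, not part of the original answer: the module name demo_helpers is hypothetical and a temporary directory stands in for the mounted drive. Note that objects created from the old version of the module are not updated by a reload.

```python
import importlib
import pathlib
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    module_file = pathlib.Path(tmp, "demo_helpers.py")  # hypothetical module
    module_file.write_text("VERSION = 'first'\n")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    demo_helpers = importlib.import_module("demo_helpers")
    before = demo_helpers.VERSION

    # Simulate editing the file on the mounted drive.
    module_file.write_text("VERSION = 'second, edited'\n")
    importlib.reload(demo_helpers)  # re-runs the module's code in place
    after = demo_helpers.VERSION

print(before, "->", after)
```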
You can run the following commands in Colab to mount the drive:
from google.colab import drive
drive.mount('/content/gdrive')
and you can download from an external URL into the drive with the simple Linux command wget, like this:
!wget 'https://dataverse.harvard.edu/dataset'

How do I upload a file to Google Colaboratory that is already on Google Drive?

https://colab.research.google.com/notebooks/io.ipynb#scrollTo=KHeruhacFpSU
The help in this notebook explains how to upload a file to Drive and then download it to Colaboratory, but my files are already in Drive.
Where can I find the file ID?
# Download the file we just uploaded.
#
# Replace the assignment below with your file ID
# to download a different file.
#
# A file ID looks like: 1uBtlaggVyWshwcyP6kEI-y_W3P8D26sz
file_id = 'target_file_id'
My advice would be to use pydrive for this (docs).
You could also do this via the Drive UI -- I think the shortest path is to select the file, click "Get shareable link" -- it's the id parameter in the resulting URL. (If the file wasn't shared when you started, you'll want to then uncheck the green "link" button.)
Connect to Google Drive using the snippet below.
You will have to authenticate twice using the link from the cell output. Once this step is taken care of, you can load files from Drive and save to Drive directly, as you would locally.
!apt-get install -y -qq software-properties-common python-software-properties module-init-tools
!add-apt-repository -y ppa:alessandro-strada/ppa 2>&1 > /dev/null
!apt-get update -qq 2>&1 > /dev/null
!apt-get -y install -qq google-drive-ocamlfuse fuse
from google.colab import auth
auth.authenticate_user()
from oauth2client.client import GoogleCredentials
creds = GoogleCredentials.get_application_default()
import getpass
!google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret} < /dev/null 2>&1 | grep URL
vcode = getpass.getpass()
!echo {vcode} | google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret}
!mkdir -p drive
!google-drive-ocamlfuse drive
Read a CSV using pandas:
import pandas as pd
df = pd.read_csv('drive/path/file.csv')
Save a CSV. Use index=False if you don't need the index as the first column in the CSV:
df.to_csv('drive/path/file.csv', index=False)
You can use the CurlWget extension in Chrome. If you want to download anything, just click download, and as soon as the download starts you can cancel it. Go to CurlWget and copy the whole command it generated for the file or data.
Go to Colab, add a cell, paste the command, and put a ! mark before it.
Better to use the Colab API:
from google.colab import drive
drive.mount('/content/drive')