Mount Google Drive with PyDrive - google-colaboratory

I was using Google Colab and this code to access my Google Drive:
from google.colab import drive
drive.mount('/content/gdrive')
It works well, but the authentication doesn't last long and I don't want to re-enter my credentials all the time. So I tried to use PyDrive to save my credentials to a file (using this answer):
!pip install pydrive
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials
gauth = GoogleAuth()
gauth.LoadCredentialsFile("mycreds.txt")
gauth.Authorize()
drive = GoogleDrive(gauth)
but with this solution I can only write files remotely, and I would like to be able to mount my Google Drive entirely so that I can easily use Unix commands. Is there a way to do that?

PyDrive doesn't create a FUSE mount, so it doesn't work for your intended purpose.
Authentication for drive.mount() should last the lifetime of the assigned VM, and no option is going to outlast a VM's assignment, so I don't think what you want is possible today.
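That said, if per-file access through PyDrive is acceptable, the credential-caching pattern from the answer the question links to normally includes a save step. A sketch (mycreds.txt as in the question; note that LocalWebserverAuth() needs a local browser, so on Colab you would use CommandLineAuth() instead):
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive

gauth = GoogleAuth()
gauth.LoadCredentialsFile("mycreds.txt")
if gauth.credentials is None:
    # No cached credentials yet: authenticate once
    gauth.LocalWebserverAuth()
elif gauth.access_token_expired:
    gauth.Refresh()           # refresh the expired token
else:
    gauth.Authorize()         # reuse the cached, still-valid token
gauth.SaveCredentialsFile("mycreds.txt")  # persist for the next session
drive = GoogleDrive(gauth)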

I was looking for an answer to this too, and it appears that the easiest way to achieve it is by using the Colab Pro version.
You can find more info here.


How to make a directory on Google Colab?
I can't make a new directory on Google Drive from Google Colab.
If you are looking to create a new folder in the 'Files' section (to the left of the notebook), you can run this in a code cell:
!mkdir new_folder
Just remember that Colab is a temporary environment with an idle timeout of 90 minutes and an absolute timeout of 12 hours.
You can use the !mkdir foldername command to create a folder in the current working directory, which is /content by default.
If you want to create a directory conditionally, use os.mkdir('folder_path'):
import os

# Create the directory only if it doesn't already exist, then cd into it
if not os.path.exists('/content/malayalam_keyphrase_extraction'):
    os.mkdir('/content/malayalam_keyphrase_extraction')
os.chdir('/content/malayalam_keyphrase_extraction')
!pwd
!ls
or
!mkdir malayalam_keyphrase_extraction
NB: malayalam_keyphrase_extraction is the folder name
from google.colab import drive
drive.mount('/content/drive')
mkdir "/content/drive/My Drive/name"
The way you want to do this is as follows:
Mount your drive (i.e. Connect your Colab to your Google Drive):
from google.colab import drive
drive.mount('/content/gdrive')
Use the os library. It covers most of your typical bash commands, which you can run right from your Python script:
import os

path = "gdrive/MyDrive/Rest/Of/Path"  # relative to /content, the default working directory
os.mkdir(path)
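If intermediate directories in the path might not exist yet, os.makedirs is the safer variant (a sketch reusing the placeholder path above):
import os

# Creates every missing directory along the path; exist_ok=True makes the
# call a no-op when the directory already exists.
os.makedirs("gdrive/MyDrive/Rest/Of/Path", exist_ok=True)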
You can also use the file-upload feature in the GUI (the Files panel to the left of the notebook).

Accessing files in Google Colab

I am new to Google Colab. I have used the following code to download a data set from Kaggle:
!pip install kaggle
import os
os.environ['KAGGLE_USERNAME'] = "xxxxxxxxx"
os.environ['KAGGLE_KEY'] = "xxxxxxxxxxxxxxxxxxxxxxxxxx"
!kaggle competitions download -c dogs-vs-cats-redux-kernels-edition
os.chdir('/content/')
# downloaded data is in this directory
I can access all the data now, but where is this data stored? In my Google Drive? I can't find it there. What if I want to access this same data from a different notebook running in Colab? Both notebooks are stored in my Google Drive, but each seems to have its own separate '/content/' folder.
To store the data in Google Drive, you have to mount the drive first using:
from google.colab import drive
drive.mount('/content/drive')
Then you can access the drive by navigating through "/content/drive/My Drive/",
so you can download the data to your drive using the following command:
!kaggle competitions download -p "/content/drive/My Drive/" -c dogs-vs-cats-redux-kernels-edition
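The competition data arrives as a zip archive, so you will likely want to unpack it afterwards. A sketch (the archive name is an assumption based on the competition slug; check the actual name with !ls first):
!unzip -q "/content/drive/My Drive/dogs-vs-cats-redux-kernels-edition.zip" -d "/content/drive/My Drive/dogs-vs-cats"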

How to run Google BigQuery locally, outside GCP or Google Colab notebooks?

I am trying to run Google BigQuery in a Jupyter notebook on my local PC, but it turns out it's not working, whereas it works fine in Google VMs on GCP and in Google Colab notebooks.
I have tried everything but nothing seems to work.
from google.cloud import bigquery
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-1-1035661e8528> in <module>
----> 1 from google.cloud import bigquery
ModuleNotFoundError: No module named 'google'
You can connect to BigQuery from an environment which is outside GCP.
You need to set up two things:
The BigQuery client library for your language of choice. Looking at the code above, you want to use Python. You can install the BigQuery Python client library by running:
pip install --upgrade google-cloud-bigquery
Authentication to BigQuery:
a. Get your GCP credentials by running the following command:
gcloud auth application-default login
This should create a credential JSON file under "~/.config/gcloud/".
b. You can set an environment variable pointing to the JSON credentials file on the command line:
export GOOGLE_APPLICATION_CREDENTIALS="~/.config/gcloud/application_default_credentials.json"
Or you can set the environment variable in your Python program by adding the following lines (note that the client library does not expand ~, so expand it explicitly):
import os
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = os.path.expanduser(
    '~/.config/gcloud/application_default_credentials.json')
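A minimal sketch to verify the setup end to end (assumes the credentials file above exists and your default project has BigQuery enabled):
from google.cloud import bigquery

client = bigquery.Client()  # picks up GOOGLE_APPLICATION_CREDENTIALS
for row in client.query("SELECT 1 AS ok").result():
    print(row.ok)  # prints 1 if authentication and connectivity work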
Hope this helps.

How to unmount drive in Google Colab and remount to another drive?

I mounted Google Drive account A. Now I want to switch to account B, but I cannot do that, because there is no way for me to enter a new authentication key when executing drive.mount().
What I have tried and failed:
restart browser, restart computer
use force_remount=True in drive.mount(); it only automatically remounts account A and does not ask me for a new mounting target
change account A password
change run-time type from GPU to None and back to GPU
open everything in incognito mode
sign out all google accounts
How can I:
forget previous authentication key so it will ask me for a new one?
dismount drive and forget previous authentication key?
I have found that 'Restart runtime...' does not work, and changing permissions is too much of a hassle.
Luckily, the drive module is equipped with just the function you need:
from google.colab import drive
drive.flush_and_unmount()
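After flushing, mounting again should prompt for authorization, letting you pick account B. A sketch (note that a later answer on this page adds an !rm -rf /content/drive step in between if the prompt does not appear):
from google.colab import drive

drive.flush_and_unmount()                          # detach account A
drive.mount('/content/drive', force_remount=True)  # prompts for new auth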
You can reset your Colab backend by selecting the 'Reset all runtimes...' item from the Runtime menu.
Be aware, however, that this will discard your current backend.
Another solution to your problem could be to terminate your session and run your code (drive.mount()) again.
Steps:
1) Press the "Additional connection options" button. It is the little arrow button next to RAM and DISK.
2) Select "Manage sessions".
3) Press the "Terminate" button.
4) Run your code (drive.mount()) again.
Now you will be asked to enter your new key.
To force Colab to ask for a new key without waiting or resetting the runtime you can revoke the previous key.
To do this:
go to https://myaccount.google.com/permissions (or manually navigate to Security → Manage third-party access on your Google account page),
on the top right, select your profile image or initial, and then select the account whose drive you want to disconnect from Colab,
select Google Drive File Stream in the Google apps section, then select Remove access.
Executing drive.mount() will now ask for a new key.
Remounting does not work when you have recently mounted and unmounted using flush_and_unmount(). The correct steps to follow are (these worked for me at the time of posting):
After mounting using:
from google.colab import drive
drive.mount('/content/drive')
Unmount using drive.flush_and_unmount(). You will no longer see the 'drive/' folder, but trust me: you should still run !rm -rf /content/drive before remounting the drive using:
from google.colab import drive
drive.mount('/content/drive', force_remount=True)
And you will again get the authorization request, where you can choose a new Google account.
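Putting those steps together (each mount call will prompt for authorization; a sketch of the sequence described above):
from google.colab import drive

drive.mount('/content/drive')        # initial mount (account A)
drive.flush_and_unmount()            # unmount and flush cached state
!rm -rf /content/drive               # clear the stale mountpoint contents
drive.mount('/content/drive', force_remount=True)  # re-auth as account B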
You can terminate the session via Runtime -> Manage sessions. That should do the trick, and you can remount the drive again.
Restarting runtimes and removing access did not help. I discovered that the notebook I was using had created directories on the mountpoint:
from google.colab import drive
drive.mount('/content/drive')
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
I had to first remove the subdirectories on the mountpoint. First ensure that your drive is not actually mounted!
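One way to check, assuming the standard Linux mountpoint utility is present in the Colab VM:
!mountpoint /content/drive
# prints "/content/drive is a mountpoint" when Drive is mounted, and
# "is not a mountpoint" when it is safe to delete the directory contents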
!find /content/drive
/content/drive
/content/drive/My Drive
/content/drive/My Drive/Colab Notebooks
/content/drive/My Drive/Colab Notebooks/assignment4
/content/drive/My Drive/Colab Notebooks/assignment4/output_dir
/content/drive/My Drive/Colab Notebooks/assignment4/output_dir/2020-04-05_16:17:15
The files and directories above were accidentally created by the notebook before I mounted the drive. Once you are sure (are you sure?) your drive is not mounted, delete the subdirectories:
!rm -rf /content/drive
After this, I was able to mount the drive.
Here is an explanation from their FAQs.
Why do Drive operations sometimes fail due to quota?
Google Drive enforces various limits, including per-user and per-file operation count and bandwidth quotas. Exceeding these limits will trigger Input/output error as above, and show a notification in the Colab UI. A typical cause is accessing a popular shared file, or accessing too many distinct files too quickly. Workarounds include:
Copy the file using drive.google.com and don't share it widely so that other users don't use up its limits.
Avoid making many small I/O reads, instead opting to copy data from Drive to the Colab VM in an archive format (e.g. .zip or .tar.gz files) and unarchive the data locally on the VM instead of in the mounted Drive directory.
Wait a day for quota limits to reset.
https://research.google.com/colaboratory/faq.html#drive-quota
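A sketch of that archive workaround (the paths and archive name are placeholders):
# Copy one archive from Drive to the VM's local disk, then extract it there,
# so Drive sees a single large read instead of many small ones.
!cp "/content/drive/My Drive/data.tar.gz" /content/
!tar -xzf /content/data.tar.gz -C /content/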
Symptom
/content/drive gets auto-mounted without me mounting it, and I am not asked for "Enter your authorization code:".
A cached old state of the drive kept showing up.
The actual Google Drive content did not show up.
Terminating, restarting, factory resetting, revoking permissions, and clearing the Chrome cache did not work.
Flush and unmount via google.colab.drive.flush_and_unmount() did not work.
Solution
Create a dummy file inside the mount point /content/drive.
Take a moment and make sure the content of /content/drive is not the same as that in the Google Drive UI.
Run rm -rf /content/drive.
Run google.colab.drive.flush_and_unmount()
From the menu Runtime -> Factory reset runtime.
Then re-running google.colab.drive.mount('/content/drive', force_remount=True) finally asked for "Enter your authorization code:".
The current code for the drive.mount() function is found at https://github.com/googlecolab/colabtools/blob/fe964e0e046c12394bae732eaaeda478bc5fa350/google/colab/drive.py
It is a wrapper for the drive executable found at /opt/google/drive/drive. I have found that the executable accepts a flag, authorize_new_user, which can be used to force a reauthentication.
Copy and paste the contents of the drive.py file into your notebook. Then modify the call to d.sendline(), currently on line 189, to look like this (note the addition of the authorize_new_user flag):
d.sendline(
    ('cat {fifo} | head -1 | ( {d}/drive '
     '--features=max_parallel_push_task_instances:10,'
     'max_operation_batch_size:15,opendir_timeout_ms:{timeout_ms},'
     'virtual_folders:true '
     '--authorize_new_user=True '
     '--inet_family=' + inet_family + ' ' + metadata_auth_arg +
     '--preferences=trusted_root_certs_file_path:'
     '{d}/roots.pem,mount_point_path:{mnt} --console_auth 2>&1 '
     '| grep --line-buffered -E "{oauth_prompt}|{problem_and_stopped}"; '
     'echo "{drive_exited}"; ) &').format(
         d=drive_dir,
         timeout_ms=timeout_ms,
         mnt=mountpoint,
         fifo=fifo,
         oauth_prompt=oauth_prompt,
         problem_and_stopped=problem_and_stopped,
         drive_exited=drive_exited))
Call either the drive module's version of flush_and_unmount() or the one you pasted in, and then call your version of mount() to log in as a different user!
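If you go this route, the re-login flow afterwards looks roughly like this (a sketch; mount here refers to the patched function you pasted into the notebook, not google.colab.drive.mount):
from google.colab import drive as stock_drive

stock_drive.flush_and_unmount()  # release the current user's mount
mount('/content/drive')          # patched mount(): forces a fresh OAuth prompt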

Save files/pictures in Google Colaboratory

At the moment, I work with 400+ images and upload them with
from google.colab import files
uploaded = files.upload()
This works fine, but I have to re-upload all the images every time I leave my Colaboratory session. It's pretty annoying because the upload takes 5-10 minutes.
Is there any way to prevent this? It seems like Colaboratory only stores the files temporarily.
I have to use Google Colaboratory because I need their GPU.
Thanks in advance :)
As far as I know, there is no way to permanently store data on a Google Colab VM, but there are faster ways to upload data to Colab than files.upload().
For example, you can upload your images to Google Drive once and then 1) mount Google Drive directly in your VM, or 2) use PyDrive to download your images to your VM. Both options should be much faster than uploading the images from your local drive.
Mounting Drive in your VM
Mount Google Drive:
from google.colab import drive
drive.mount('/gdrive')
Print the contents of foo.txt located in the root directory of Drive:
with open('/gdrive/foo.txt') as f:
    for line in f:
        print(line)
Using PyDrive
Take a look at the first answer to this question.
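That answer boils down to something like the following sketch (the file id and output filename are placeholders):
!pip install -q pydrive
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

# Authenticate with the Colab helper, then hand the credentials to PyDrive
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

# Download one file from Drive to the VM's local disk
f = drive.CreateFile({'id': 'YOUR_FILE_ID'})  # placeholder file id
f.GetContentFile('image.jpg')                 # local filename to save as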
First of all, mount your Google Drive:
# Load the Drive helper and mount
from google.colab import drive
# This will prompt for authorization.
drive.mount('/content/drive')
The result is:
Mounted at /content/drive
To check the mounted directory, run this command:
# After executing the cell above, Drive
# files will be present in "/content/drive/My Drive".
!ls "/content/drive/My Drive"
The result is something like this:
07_structured_data.ipynb Sample Excel file.xlsx
BigQuery recipes script.ipynb
Colab Notebooks TFGan tutorial in Colab.txt
Copy of nima colab.ipynb to_upload (1).ipynb
created.txt to_upload (2).ipynb
Exported DataFrame sheet.gsheet to_upload (3).ipynb
foo.txt to_upload.ipynb
Pickle + Drive FUSE example.ipynb variables.pickle
Sample Excel file.gsheet