Is there a way to read the images for my convolutional neural network directly from my desktop? - tensorflow

I'm training a Convolutional Neural Network using Google Colaboratory. I have my data (images) stored in Google Drive and I'm able to use it correctly. However, sometimes the process of reading the images is too slow or does not work at all (other times it is faster and I have no problem reading the images). In order to read the images from Google Drive I use:
import glob
from os import path

from google.colab import drive
drive.mount('/content/drive')
!unzip -u "/content/drive/My Drive/the folder/files.zip"
IMAGE_PATH = '/content/drive/My Drive/the folder'
file_paths = glob.glob(path.join(IMAGE_PATH, '*.png'))
Sometimes this works, other times it does not, or it is too slow :).
Either way, I would like to read my data from a folder on my desktop without using Google Drive, but I'm not able to do this.
I'm trying the following:
IMAGE_PATH = 'C:/Users/path/to/my/folder'
file_paths = glob.glob(path.join(IMAGE_PATH, '*.png'))
But I get an error saying that the directory/file does not exist.

Google Colab cannot directly access a dataset on your local machine because it runs on a separate virtual machine in the cloud. You need to upload the dataset to Google Drive and then load it into Colab's runtime for model building.
For that, follow the steps given below:
Create a zip file of your large dataset and upload this file to your Google Drive.
Now open Google Colab with the same Google account, mount your Google Drive using the code below, and authorize access to the drive:
from google.colab import drive
drive.mount('/content/drive')
Your uploaded zip file will then be available in the mounted drive under /content/drive/MyDrive/, visible in the Files pane on the left.
To read the dataset in Google Colab, unzip the archive and extract its contents into the /tmp folder using the code below.
import zipfile
import os
zip_ref = zipfile.ZipFile('/content/drive/MyDrive/train.zip', 'r') #Opens the zip file in read mode
zip_ref.extractall('/tmp') #Extracts the files into the /tmp folder
zip_ref.close()
You can check the extracted files in the /tmp/train folder in the left pane.
Finally, join the path of your dataset so you can use it in Google Colab's runtime environment.
train_dataset = os.path.join('/tmp/train/') # dataset
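From there, a minimal sketch of reading the extracted images back with glob, mirroring the call from the question (this assumes the PNG files sit directly under /tmp/train):
import glob
import os

train_dataset = '/tmp/train/'
file_paths = glob.glob(os.path.join(train_dataset, '*.png'))  # collect all PNG paths
print(len(file_paths), 'images found')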

Related

How to upload large files from local pc to DBFS?

I am trying to learn Spark SQL in Databricks and want to work with the Yelp dataset; however, the file is too large to upload to DBFS from the UI. Thanks, Filip
There are several approaches to that:
Use the Databricks CLI's dbfs command to upload local data to DBFS.
Download the dataset directly from the notebook, for example by using %sh wget URL, and unpack the archive to DBFS (either by using /dbfs/path/... as the destination, or by using the dbutils.fs.cp command to copy files from the driver node to DBFS); see the sketch after this list.
Upload the files to AWS S3, Azure Data Lake Storage, Google Cloud Storage or similar, and access the data there.
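A minimal sketch of the second option, assuming a Databricks notebook where dbutils and the driver's local filesystem are available; the URL and file names below are placeholders, not from the question:
import urllib.request

# Download the archive onto the driver node (placeholder URL and name)
local_path = '/tmp/yelp_dataset.tar'
urllib.request.urlretrieve('https://example.com/yelp_dataset.tar', local_path)

# Copy it from the driver's local filesystem into DBFS
dbutils.fs.cp('file:' + local_path, 'dbfs:/FileStore/tables/yelp_dataset.tar')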
Upload the file you want to load into Databricks to Google Drive.
from urllib.request import urlopen
from shutil import copyfileobj
my_url = 'paste your url here'
my_filename = 'give your filename'
file_path = '/FileStore/tables' # location at which you want to move the downloaded file
# Download the file from Google Drive to Databricks
with urlopen(my_url) as in_stream, open(my_filename, 'wb') as out_file:
    copyfileobj(in_stream, out_file)
# Check where the file has been downloaded
# in my case it is
display(dbutils.fs.ls('file:/databricks/driver'))
# moving the file to desired location
# dbutils.fs.mv(downloaded_location, desired_location)
dbutils.fs.mv("file:/databricks/driver/my_file", file_path)
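Once the file is in DBFS you can read it with Spark and query it with Spark SQL. A sketch, assuming the uploaded file is one of the Yelp JSON files; the file name below is a placeholder:
# Read the JSON file from DBFS and expose it to Spark SQL
df = spark.read.json('dbfs:/FileStore/tables/my_file.json')
df.createOrReplaceTempView('yelp')
spark.sql('SELECT COUNT(*) AS row_count FROM yelp').show()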
I hope this helps

VideoColorizerColab.ipynb doesn't communicate with google drive links

DownloadError: ERROR: unable to download video data: HTTP Error 302: The HTTP server returned a redirect error that would lead to an infinite loop.
The last 30x error message was:
Found
Since yesterday I can't use my Google Drive shared file links with VideoColorizerColab.ipynb. I get the error above every time I try to colorize my videos.
Does anyone know what's going on? Thank you, Géza.
You might want to try mounting your Google Drive in your Colab notebook and copying the video into Colab rather than using the link to download the video.
The code to mount your Google Drive in Colab is:
from google.colab import drive
drive.mount('/content/drive')
After this step, you can use all the content in your Drive as folders in your Colab notebook. You can see them in the Files section on the left side of your notebook. You can select a file, right-click, copy its path, and use that path for any operation on the file.
This is an example of copying:
!cp -r /content/drive/My\ Drive/headTrainingDatastructure/eval /content/models/research/object_detection/
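For a single video, a minimal sketch of copying it from the mounted Drive into the Colab VM before running the colorizer (the folder and file name are placeholders):
import shutil

src = '/content/drive/My Drive/videos/my_video.mp4'  # placeholder path inside your Drive
dst = '/content/my_video.mp4'                        # local copy on the Colab VM
shutil.copy(src, dst)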

How to load a 30GB dataset in Google Colab

I have a 30GB dataset which I need to upload to Google Colab. What is the process to upload it?
It depends on what you mean by "have a 30GB dataset". If the dataset is on your local machine, then you need to:
Upload your dataset to Google Drive first
Then mount your Google Drive to your Colab notebook.
If you have the dataset on a server online, then you need to:
Mount your Google Drive to your notebook
Then, download it to your Google Drive directly
You can use this code to mount your Google Drive to your notebook:
import os
from google.colab import drive
drive.mount('/content/gdrive')
ROOT = "/content/gdrive/My Drive/"
os.chdir(ROOT)
If your data is on a server, then you can download it directly by running the following code in a notebook cell.
!wget [dataset_url]
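If you prefer plain Python over wget, a sketch of downloading straight into the mounted Drive so the file persists between sessions (the URL and file name are placeholders):
import urllib.request

url = 'https://example.com/dataset.zip'          # placeholder dataset URL
target = '/content/gdrive/My Drive/dataset.zip'  # lands inside your Drive
urllib.request.urlretrieve(url, target)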
If your problem is not enough disk space, you can switch to a GPU runtime to get 350 GB of space.
MENU > Runtime > Change runtime type > Hardware accelerator = GPU
The process is the same as in @Anwarvic's answer.
You can get more space by changing the hardware accelerator from GPU to TPU:
MENU > Runtime > Change runtime type > Hardware accelerator = TPU

Accessing files in Google Colab

I am new to Google Colab. I have used the following code to download a data set from Kaggle:
!pip install kaggle
import os
os.environ['KAGGLE_USERNAME'] = "xxxxxxxxx"
os.environ['KAGGLE_KEY'] = "xxxxxxxxxxxxxxxxxxxxxxxxxx"
!kaggle competitions download -c dogs-vs-cats-redux-kernels-edition
os.chdir('/content/')
# Downloaded data into this directory
I can access all the data now, but where is this data stored? In my Google Drive? But I can't find it. What if I want to access this same data from a different notebook running in Colab? Both notebooks are stored in my Google Drive, but seem to have their own different '/content/' folders.
To store the data in Google Drive you have to mount the drive first using:
from google.colab import drive
drive.mount('/content/drive')
Then you can access the drive by navigating through "/content/drive/My Drive/",
so you can download the data to your drive using the following command:
!kaggle competitions download -p "/content/drive/My Drive/" -c dogs-vs-cats-redux-kernels-edition
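To reuse the same data from a different notebook, a minimal sketch: mount the same Drive there and extract the downloaded archive into that notebook's runtime (the archive name below is a placeholder for whatever the Kaggle download produced):
import zipfile
from google.colab import drive

drive.mount('/content/drive')
with zipfile.ZipFile('/content/drive/My Drive/dogs-vs-cats-redux-kernels-edition.zip') as zf:
    zf.extractall('/content/data')  # extracted into this notebook's runtime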

Save files/pictures in Google Colaboratory

At the moment, I work with 400+ images and upload them with:
from google.colab import files
uploaded = files.upload()
This works fine, but I have to re-upload all the images every time I leave Colaboratory. It is pretty annoying because the upload takes 5-10 minutes.
Is there any possibility to prevent this? It seems like Colaboratory only saves the files temporarily.
I have to use Google Colaboratory because I need their GPU.
Thanks in advance :)
As far as I know, there is no way to permanently store data on a Google Colab VM, but there are faster ways to upload data on Colab than files.upload().
For example, you can upload your images to Google Drive once and then 1) mount Google Drive directly in your VM or 2) use PyDrive to download your images to your VM. Both of these options should be way faster than uploading your images from your local drive.
Mounting Drive in your VM
Mount Google Drive:
from google.colab import drive
drive.mount('/gdrive')
Print the contents of foo.txt located in the root directory of Drive:
with open('/gdrive/foo.txt') as f:
    for line in f:
        print(line)
Using PyDrive
Take a look at the first answer to this question.
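For reference, a minimal PyDrive sketch along those lines, assuming you have the Drive file ID of the uploaded archive (the ID and file name below are placeholders):
from google.colab import auth
from oauth2client.client import GoogleCredentials
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive

# Authenticate and build a PyDrive client
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive_client = GoogleDrive(gauth)

# Download a file by its Drive ID onto the Colab VM
downloaded = drive_client.CreateFile({'id': 'YOUR_FILE_ID'})
downloaded.GetContentFile('images.zip')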
First of all, mount your Google Drive:
# Load the Drive helper and mount
from google.colab import drive
# This will prompt for authorization.
drive.mount('/content/drive')
The result is:
Mounted at /content/drive
To check the mounted directory, run this command:
# After executing the cell above, Drive
# files will be present in "/content/drive/My Drive".
!ls "/content/drive/My Drive"
The result is something like this:
07_structured_data.ipynb Sample Excel file.xlsx
BigQuery recipes script.ipynb
Colab Notebooks TFGan tutorial in Colab.txt
Copy of nima colab.ipynb to_upload (1).ipynb
created.txt to_upload (2).ipynb
Exported DataFrame sheet.gsheet to_upload (3).ipynb
foo.txt to_upload.ipynb
Pickle + Drive FUSE example.ipynb variables.pickle
Sample Excel file.gsheet