Data that should be saved in drive during run is gone - google-colaboratory

I have a Colab Pro+ subscription, and I ran Python code that trains a network using PyTorch.
Until now, I have successfully trained my network and used my Google Drive to save checkpoints and such.
But now I had a run that lasted around 16 hours, and no checkpoint or any other data was saved - even though the logs clearly show that the data was saved and that metrics were even evaluated on the saved data.
Maybe the data was somehow saved in a different folder?
I looked in the Drive activity and could not see any data that was saved.
Has anyone run into this before?
Any help would be appreciated.
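One thing worth checking is whether the checkpoints were written under the mounted Drive path (/content/drive/...) rather than the VM's ephemeral disk under /content, and whether pending writes were flushed before the runtime was recycled. A minimal sketch, assuming PyTorch and a hypothetical checkpoint path:

from google.colab import drive
import torch

drive.mount('/content/drive')  # anything saved outside this mount lives on the VM's ephemeral disk

ckpt_path = '/content/drive/MyDrive/experiments/checkpoint.pt'  # hypothetical path
torch.save({'model_state': model.state_dict()}, ckpt_path)      # assumes `model` is your network

drive.flush_and_unmount()  # forces buffered writes to sync to Drive before the runtime ends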

Related

How to save tensorflow models in RAMDisk?

In my original Python code, the ckpt model file is restored frequently. Reading the checkpoints again and again takes too much time, so I decided to keep the model in memory. A simple way is to create a RAMDisk and save the model on that disk. However, something unexpected happens.
I set up a 1 GB RAMDisk following the tutorial "How to Create RAM Disk in Windows 10 for Super-Fast Read and Write Speeds". My system is Windows 11.
I made two attempts. In the first, I copied my code to the RAMDisk E: and used tf.train.Saver().save(self.sess,'./') to save the model, but it reports UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb4 in position 114: invalid start byte. However, if I put the code in any normal folder, it runs successfully.
In the second attempt, I put the code under D: and changed the line to tf.train.Saver().save(self.sess,'E:\\'), and it reports cannot create directory E: Permission Denied. Obviously, E:\ is not a directory that needs to be created, so I don't know how to handle this.
Your jupyter/python environment cannot go beyond the directory from which jupyter/python was started, which is why you get a permission denied error.
However, you can run shell commands from the jupyter notebook. If your user has write access to the destination, you can do the following.
model.save("my_model") # This saves the model to the current directory.
!move "my_model" "E:\my_model" # This moves the model from the current directory to your target directory (move on Windows; use mv on Linux/macOS).
On a side note, when searching for tf.train.Saver().save(), I get this page as the only relevant result, which says it is used for saving checkpoints rather than the model. It also recommends switching to the newer tf.train.Checkpoint or tf.keras.Model.save_weights. Nonetheless, the method above should work as expected.
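For reference, a minimal sketch of the newer tf.train.Checkpoint API (assuming TF2; the toy model and the E:/ckpts directory are hypothetical):

import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])  # toy model for illustration
ckpt = tf.train.Checkpoint(model=model)
manager = tf.train.CheckpointManager(ckpt, directory="E:/ckpts", max_to_keep=3)

save_path = manager.save()               # writes checkpoint files under E:/ckpts
ckpt.restore(manager.latest_checkpoint)  # restores the most recent checkpoint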

Session stopped without entry in log google colab

I'm running a CNN deep learning classification model on Colab Pro+. However, the session is being interrupted for no apparent reason, and no log is written to the app.log file.
[screenshot: app.log whose last entry is dated yesterday]
As can be seen in the image, it is the 30th of November and the last entry in app.log was yesterday. The program has simply stopped running for no reason.
Can you help me?
I've sent a question to the Colab GitHub account.

kaggle directly download input data from copied kernel

How can I download all the input data from a Kaggle kernel? For example, this kernel: https://www.kaggle.com/davidmezzetti/cord-19-study-metadata-export.
Once you make a copy and have the option to edit, you have the ability to run the notebook and make changes.
One thing I have noticed is that anything placed in the output directory gets a download button next to the file icon. So I could simply read each and every file and write it to the output, but that seems wasteful.
Am I missing something here?
The notebook you list contains two data sources:
another notebook (https://www.kaggle.com/davidmezzetti/cord-19-analysis-with-sentence-embeddings)
and a dataset (https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge)
You can use Kaggle's API to retrieve a kernel's output:
kaggle kernels output davidmezzetti/cord-19-analysis-with-sentence-embeddings
And to download dataset files:
kaggle datasets download allen-institute-for-ai/CORD-19-research-challenge
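If you want the files in a specific local directory, both commands accept a path flag, and dataset archives can be unzipped on download (the ./kernel_output and ./data directories here are hypothetical):

kaggle kernels output davidmezzetti/cord-19-analysis-with-sentence-embeddings -p ./kernel_output
kaggle datasets download allen-institute-for-ai/CORD-19-research-challenge -p ./data --unzip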

Does google colab permanently change file

I am doing some data pre-processing on Google Colab and am wondering how it handles dataset manipulation. For example, R does not change the original dataset until you use write.csv to export the changed dataset. Does it work similarly in Colab? Thank you!
Until you explicitly save your changed data, e.g. by calling df.to_csv on the same file you read from, your changed dataset is not written to disk.
You must also remember that your Colab session might expire due to inactivity (after an hour or so), and all progress would be lost.
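A minimal sketch of that behavior, with a hypothetical file and column name:

import pandas as pd

df = pd.read_csv("data.csv")         # loads a copy of the file into memory
df["price"] = df["price"].fillna(0)  # changes only the in-memory copy
df.to_csv("data.csv", index=False)   # only this line overwrites the file on disk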
Update
To download a model, a dataset, or any big file from Google Drive, the gdown command is already available:
!gdown https://drive.google.com/uc?id=FILE_ID
Download your code from GitHub and run predictions using the model you already downloaded:
!git clone https://USERNAME:TOKEN@github.com/username/project.git
(Note that GitHub no longer accepts account passwords for Git operations; use a personal access token instead.)
Prefix a line with ! in Colab and it will be treated as a bash command. For example, you can download files from the internet using wget:
!wget file_url
You can commit and push your updated code to GitHub, and your updated dataset / model to Google Drive or Dropbox.
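As a minimal sketch of pushing updates back from Colab cells (the identity and commit message are placeholders, and the clone above must have used credentials with push access):

%cd project  # %cd persists across cells, unlike !cd
!git config user.email "you@example.com"  # placeholder identity
!git config user.name "Your Name"
!git add -A
!git commit -m "update model and results"
!git push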

How to autosave notebooks in Google colab?

I was recently working in a notebook on Google Colab when my computer ran out of battery and died. None of the progress I had made was saved anywhere!
I'm very used to jupyter notebooks, which save my files pretty much every time I execute a cell.
Is there a way to have an equivalent feature in Google Colab?
Autosave is already implemented in Google Colab, but there is a delay between the moment you execute a cell and when the save occurs.
You can verify this yourself by going to File > Revision history, executing a cell, and waiting for the list to refresh.
That being said, I have also experienced loss of data in the past, which I can't explain. It might be a glitch.
As a good practice, I try to save every time I remember.
Good luck.
Autosave every 60 seconds by running this "magic command" in a new code cell:
%autosave 60
Colab will confirm it when you run the cell by printing: "Autosave changes every 60 seconds"
To display the list of all magic commands, use:
%lsmagic
Additionally, you can open the quick reference guide, which describes all the magic commands and what they do, with:
%quickref
Enjoy!