I'm using Google Colab to learn and tinker with ML and TensorFlow. I had a huge dataset in multiple multi-part rar files. I tried simply
!unrar e zip-file 'extdir'
but after successfully extracting a couple of archives, it starts throwing input/output errors.
Does Google block you after you unrar a couple of GBs?
I have already tried resetting the runtime environment and changing the runtime from Py2 to Py3, but nothing made a difference.
True, it doesn't work after a couple of runs.
Try unrar-free, the free version of unrar.
Check out the help manual below:
https://helpmanual.io/help/unrar-free/
No, Google doesn't block you for extracting large files. Also, unrar-free gave me the same error as before. Instead, you can install p7zip and extract RARv5 archives with 7z. This solved the exact same problem I was facing (I had a rar file of ~20 GiB):
!apt install p7zip-full p7zip-rar
then
!7z e zip-file
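To make the commands above concrete, here is a hedged sketch: the archive name `data.part1.rar` and the output directory are placeholders for your own files, and it assumes 7z from p7zip-full/p7zip-rar is installed (in Colab, prefix each line with `!`). Note that `7z x` preserves the folder structure inside the archive, while `7z e` flattens everything into one directory; pointing 7z at the first volume of a multi-part set makes it pick up the remaining volumes automatically. The round-trip at the end only exists so the syntax can be checked without a real rar file:

```shell
# For the real use case (placeholder names, run after the apt install above):
#   7z x data.part1.rar -o/content/extracted
# 'x' keeps paths, 'e' flattens; .part2, .part3, ... are found automatically.

# Round-trip demo of the same archive/extract syntax on a plain 7z archive:
if command -v 7z >/dev/null 2>&1; then
  demo=$(mktemp -d)
  cd "$demo"
  echo "hello" > sample.txt
  7z a demo.7z sample.txt >/dev/null   # 'a' adds files to an archive
  7z x -y demo.7z -oout >/dev/null     # 'x' extracts into ./out, keeping paths
  result=$(cat out/sample.txt)
else
  result="7z-not-installed"
fi
echo "$result"
```

The `-o` switch takes its directory with no space (`-oout`, `-o/content/extracted`), which is an easy thing to trip over.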
I was trying out dlib's deep-learning-based face detection (MMOD), and it worked perfectly fine without any errors. After the weekend, I reran my Google Colab notebook, and I get the following error:
RuntimeError: Error while calling cudnnConvolutionBiasActivationForward( context(), &alpha1, descriptor(data), data.device(), (const cudnnFilterDescriptor_t)filter_handle, filters.device(), (const cudnnConvolutionDescriptor_t)conv_handle, (cudnnConvolutionFwdAlgo_t)forward_algo, forward_workspace, forward_workspace_size_in_bytes, &alpha2, out_desc, out, descriptor(biases), biases.device(), identity_activation_descriptor(), out_desc, out) in file /tmp/pip-install-fdw8qrx_/dlib_e3176ea453c4478d8dbecc372b81297e/dlib/cuda/cudnn_dlibapi.cpp:1237. code: 9, reason: CUDNN_STATUS_NOT_SUPPORTED
It is literally the same code I previously saved to GitHub, now run in Google Colab.
Any ideas about what could have happened over the weekend, and how to fix it? Thank you!
So after I tried EVERYTHING I could come up with (trying the code on a different machine, on a different platform, checking if there were any library updates), I went through the version committed to GitHub and realized that the dlib library had been updated, but it wasn't announced anywhere...
So yeah, note to future self: always print the library version after importing the tools; it might save DAYS of trying to figure out what on earth happened.
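A minimal way to act on that note: log the interpreter and library versions at the top of every notebook, so an unannounced upgrade shows up in the saved cell output. The library list below is an assumption for illustration; swap in whatever your notebook actually imports (dlib, tensorflow, ...). In Colab, prefix the shell lines with `!`:

```shell
# Record the interpreter version first.
pyver=$(python3 --version 2>&1)
echo "$pyver"

# Then print the version of each third-party library the notebook uses.
python3 - <<'EOF'
import importlib

# Assumed example list; replace with the libraries you actually import.
for name in ("numpy", "dlib"):
    try:
        mod = importlib.import_module(name)
        print(name, getattr(mod, "__version__", "unknown"))
    except ImportError:
        print(name, "not installed")
EOF
```

Comparing this output against an older saved run of the notebook makes a silent upgrade like the dlib one above immediately visible.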
I am experimenting with EDK2 by Tianocore (https://github.com/tianocore/edk2). I can build BIOS images as well as UEFI applications and drivers, but when it comes to building a UEFI capsule, I am not sure how to go about it.
https://uefi.org/sites/default/files/resources/UEFI%20Fall%202018%20Intel%20UEFI%20Capsules.pdf points to some ideas, but I am not sure of the exact path to take here.
I see two possibilities:
https://github.com/tianocore/edk2/tree/master/FmpDevicePkg: this is the package mentioned in the PDF above. The PDF also describes an integrated build pipeline for making a capsule, and it mentions a standalone Python script, which is option two.
https://github.com/tianocore/edk2/tree/c640186ec8aae6164123ee38de6409aed69eab12/BaseTools/Source/Python/GenFds: there are standalone scripts at this location to make these images and artifacts, such as capsules and headers, but I am unsure whether they are intended to be used as-is or only as part of a larger build pipeline.
My end goal here is to produce a UEFI capsule and place UEFI drivers inside it as the payload so any tips or help would be appreciated.
How can I use an earlier version of Python, i.e. version 2.x?
Under the 'Change runtime type' option, I can see a setting for selecting the hardware accelerator, but nothing for the Python version.
You can use these 2 shortcuts to create a Python 2 Colab.
bit.ly/colabpy2
colab.to/py2
They will forward to this URL.
https://colab.research.google.com/notebook#create=true&language=python2
Update 2022
The Python 2 kernel has now been removed, so the simple method above no longer works. You may try the difficult method I used with Python 3.10 if you really must.
Python 2 reached its end of life on January 1, 2020, and is no longer supported by the Python developer community. Because of that, Colab is in the process of deprecating Python 2 runtimes; see https://research.google.com/colaboratory/faq.html#python-2-deprecation for details.
Presently, there is no way to change to Python 2 via the Colab UI, but existing Python 2 notebooks will still connect to Python 2 for the time being. So, for example, if you open a notebook like this one: https://colab.research.google.com/gist/jakevdp/de56c474b41add4540deba2426534a49/empty-py2.ipynb and execute code, it will execute in Python 2 for now. I would suggest following that link, and then choosing File->Save A Copy In Drive to get your own copy of an empty Python 2 notebook.
But please be aware that at some point in the future, Python 2 runtimes will be entirely unavailable in Colab, even for existing notebooks that specify Python 2 in their metadata.
Python 2 is deprecated and is no longer available as a runtime in Colab.
If you are running a Python program from a notebook cell, you can use
!python2.7 your_program.py instead of !python your_program.py
But if you want to execute Python 2.7 code directly in notebook cells, then, as mentioned in the previous answers, it is not possible.
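As a sketch of that first approach: `legacy.py` is a hypothetical file name, and the script prints the major version of whichever interpreter runs it, so you can confirm which one was actually used. The block falls back to python3 on machines where the python2.7 binary has already been removed. In Colab, prefix the commands with `!`:

```shell
# Hypothetical script; print(sys.version_info[0]) is valid syntax in
# both Python 2 and Python 3, so it reports the interpreter's major version.
cat > legacy.py <<'EOF'
import sys
print(sys.version_info[0])
EOF

if command -v python2.7 >/dev/null 2>&1; then
  major=$(python2.7 legacy.py)   # on a runtime that still ships python2.7
else
  major=$(python3 legacy.py)     # fallback where python2.7 has been removed
fi
echo "$major"
```

If this prints 3, the python2.7 binary is gone from the VM and the trick no longer applies.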
In Q1 2019, I ran some experiments and I noticed that Colab notebooks with the same Runtime type (None/GPU/TPU) would always share the same Runtime (i.e., the same VM). For example, I could write a file to disk in one Colab notebook and read it in another Colab notebook, as long as both notebooks had the same Runtime type.
However, I tried again today (October 2019) and it now seems that each Colab notebook gets its own dedicated Runtime.
My questions are:
When did this change happen? Was this change announced anywhere?
Is this always true now? Will Runtimes sometimes be shared and sometimes not?
What is the recommended way to communicate between two Colab notebooks? I'm guessing Google Drive?
Thanks
Distinct notebooks are indeed isolated from one another. Isolation isn't configurable.
For file sharing, I think you're right that Drive is the best bet, as described in the docs:
https://colab.research.google.com/notebooks/io.ipynb#scrollTo=u22w3BFiOveA
I have found no easy way of running multiple notebooks within the same runtime. That said, I have no idea how this affects the quota. On my real computer, I'd limit GPU memory per script and run multiple Python threads. Colab doesn't let you do this, and I think that if you don't use the whole amount of GPU RAM, it shouldn't be treated the same as if you had used the whole GPU for 12 or 24 hours; your tasks could be pooled with other users'.
I used files.upload() on three hdf5 data files on Google Colab, in order to train some TensorFlow models, and it took a few minutes to complete.
Everything ran smoothly with a few minor modifications from the local Jupyter notebook. However, when I changed the runtime from "None" to "GPU", none of the previously uploaded files were present in the home folder, and I had to re-upload them. Going back to the "None" runtime showed that the files were still there.
Is there a convenient way to copy+paste the data from one runtime to another?
Thanks a lot.
I don't think there is any way to directly copy data from a CPU instance to a GPU one.
So you probably need to copy it to Google Drive (or mount it with google-drive-ocamlfuse).
Another way is to use git: add and push the files from one runtime, then clone them from the other.
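The git route can be sketched end to end. Here a local bare repository stands in for a hosted remote like GitHub, and all names and paths are made up for illustration; in practice each Colab runtime would clone the same real remote:

```shell
set -e
# A local bare repo stands in for a hosted remote (e.g. GitHub).
tmp=$(mktemp -d)
cd "$tmp"
git init --bare shared.git >/dev/null

# "Runtime A": commit the data file and push it to the shared remote.
git clone -q shared.git runtime_a
cd runtime_a
echo "training data" > dataset.txt
git add dataset.txt
git -c user.email=demo@example.com -c user.name=demo commit -qm "add dataset"
git push -q origin HEAD:main   # refspec avoids caring about the local branch name
cd ..

# "Runtime B": clone the same remote to receive the file.
git clone -q --branch main shared.git runtime_b
cat runtime_b/dataset.txt
```

For large binary datasets, keep in mind that plain git stores every version forever, so Drive is usually the better fit for multi-GB files; git shines for code and small artifacts.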