Test if notebook is running on Google Colab - google-colaboratory

How can I test if my notebook is running on Google Colab?
I need this test as obtaining / unzipping my training data is different if running on my laptop or on Colab.

Try importing google.colab:
try:
    import google.colab
    IN_COLAB = True
except ImportError:
    IN_COLAB = False
Or just check if it's in sys.modules
import sys
IN_COLAB = 'google.colab' in sys.modules
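The sys.modules check only succeeds when google.colab has already been imported in the current process. A middle ground is to ask the import system whether the package can be located, without importing it. The following is a minimal sketch; the helper name colab_available is made up for illustration. Note that it only shows the package is installed, which is not quite the same as running on Colab.

```python
import importlib.util


def colab_available():
    """Return True if the google.colab package can be located."""
    try:
        # find_spec imports the parent package "google" if it exists,
        # but does not import google.colab itself.
        return importlib.util.find_spec("google.colab") is not None
    except (ImportError, ValueError):
        # ImportError: the parent package "google" is not installed.
        # ValueError: the module is loaded but has no usable spec.
        return False


IN_COLAB = colab_available()
print("IN_COLAB:", IN_COLAB)
```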

For environments using ipython
If you are sure that the script will be run with IPython, which is the most typical usage, you can also check which IPython interpreter is in use. I think this is a little clearer, and you don't have to import any module.
if 'google.colab' in str(get_ipython()):
    print('Running on CoLab')
else:
    print('Not running on CoLab')
If you need to do it multiple times you might want to assign a variable so you don't have to repeat the str(get_ipython()).
RunningInCOLAB = 'google.colab' in str(get_ipython())
RunningInCOLAB is True if run in a Google Colab notebook.
For environments not using ipython
In this case, first check whether IPython is in use (this assumes that Colab always uses IPython):
RunningInCOLAB = 'google.colab' in str(get_ipython()) if hasattr(__builtins__,'__IPYTHON__') else False
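An alternative to inspecting __builtins__ is to call get_ipython() and treat a NameError as "not IPython". This is only a sketch of that idea; the function name running_in_colab_shell is hypothetical, and it relies on the same assumption as above, namely that the Colab shell's string representation contains 'google.colab'.

```python
def running_in_colab_shell():
    """Return True if the current IPython shell looks like a Colab shell."""
    try:
        # get_ipython is injected by IPython; in plain Python the name
        # does not exist and the lookup raises NameError.
        shell = get_ipython()  # noqa: F821
    except NameError:
        return False
    return 'google.colab' in str(shell)


RunningInCOLAB = running_in_colab_shell()
print("RunningInCOLAB:", RunningInCOLAB)
```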

You can check an environment variable like this:
import os
if 'COLAB_GPU' in os.environ:
    print("I'm running on Colab")
You can also print os.environ to see which variables are associated with Colab, and then check for one of those keys.
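To make that inspection easier, the environment can be filtered for Colab-looking keys. A small sketch follows; the helper name colab_env_keys and the idea of matching on the substring "COLAB" are assumptions for illustration, not a documented Colab interface.

```python
import os


def colab_env_keys(environ=None):
    """Return the sorted environment variable names that mention 'COLAB'."""
    env = os.environ if environ is None else environ
    return sorted(key for key in env if "COLAB" in key.upper())


# On a machine that is not Colab this usually prints an empty list.
print(colab_env_keys())
```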

Improved Solution for all Python environments
None of the other answers given here worked for me, and I was not using IPython. So I looked at the environment variables that Colab sets and found that the following check works well:
import os
if os.getenv("COLAB_RELEASE_TAG"):
    print("Running in Colab")
else:
    print("NOT in Colab")
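Tying this back to the original question, the check can be wrapped in a function and used to pick where the training data lives. This is a sketch under assumptions: the two directory paths are hypothetical placeholders, and the environ parameter exists only so the logic can be exercised without being on Colab.

```python
import os


def in_colab(environ=None):
    """Return True if COLAB_RELEASE_TAG is set to a non-empty value."""
    env = os.environ if environ is None else environ
    return bool(env.get("COLAB_RELEASE_TAG"))


def data_dir(environ=None):
    """Pick a data directory depending on where the notebook runs."""
    if in_colab(environ):
        return "/content/data"  # hypothetical location on Colab
    return os.path.join(os.getcwd(), "data")  # hypothetical local location


print("Data directory:", data_dir())
```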

In a %%bash cell, use:
%%bash
[[ ! -e /colabtools ]] && exit # Continue only if running on Google Colab
# Do Colab-only stuff here
Or the equivalent in Python:
import os
if os.path.exists('/colabtools'):
    # do stuff
    pass

Related

Geopandas and Spyder incompatibility

I want to run Geopandas modules in Spyder. Apparently Geopandas is compatible with Spyder 4.2.5 (but not with any higher version), and I could run code with this combination. However, in one of my scripts I had to use the input function, and that is where the problem starts: Spyder 4.2.5 crashes when I call input. From what I found online, this was a bug in Spyder that was fixed in Spyder 5.3. Now I have no idea how to fix this problem. If I upgrade Spyder, Geopandas will not work. If I don't upgrade Spyder, input will not work.
I was trying to run something like the following code:
def Coditions_R3():
    print("This is R3")
def Coditions_R4():
    print("This is R4")
System = input('Please Enter drone system: \n' )
print(System)
if (System == 'R3'):
    Coditions_R3()
elif (System == 'R4'):
    Coditions_R4()
Can anyone help? Is there any way to run Geopandas with higher Spyder versions, or to use something else in place of input?
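On the last point, one possible stand-in for input is to read the system name from the command-line arguments. The sketch below is not a confirmed fix for the Spyder crash; it only illustrates the idea with argparse and a dispatch table. The lowercase function names and the HANDLERS dictionary are illustrative, not taken from the question.

```python
import argparse


def conditions_r3():
    return "This is R3"


def conditions_r4():
    return "This is R4"


# Map each accepted system name to the function that handles it.
HANDLERS = {"R3": conditions_r3, "R4": conditions_r4}


def run(argv=None):
    """Parse the drone system from argv and call the matching handler."""
    parser = argparse.ArgumentParser(description="Select a drone system")
    parser.add_argument("system", choices=sorted(HANDLERS))
    args = parser.parse_args(argv)
    return HANDLERS[args.system]()


# Passing the arguments explicitly avoids any interactive prompt.
print(run(["R3"]))
```

When the script is launched from a terminal, calling run() with no argument makes argparse read the system name from the command line instead.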

Problem with connecting google Colab with google Cloud TPUs

I have this code, which is based on the T5 notebook (https://colab.research.google.com/github/google-research/text-to-text-transfer-transformer/blob/master/notebooks/t5-trivia.ipynb):
FINETUNE_STEPS = 3000 #@param {type: "integer"}
model.finetune(
    mixture_or_task_name="text_diacritization_short",
    pretrained_model_dir=PRETRAINED_DIR,
    finetune_steps=FINETUNE_STEPS
)
My code was working fine on 8 August; then something changed, and it now produces the error below.
These two lines also appeared when my model worked, so I don't think they are the problem.
INFO:root:system_path_file_exists:gs://my_bucket/my_file/models/small/operative_config.gin
ERROR:root:Path not found: gs://my_bucket/my_file/models/small/operative_config.gin
The rest of the error output:
From /usr/local/lib/python3.7/dist-packages/tensorflow/python/training/training_util.py:399: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
WARNING:absl:Using an uncached FunctionDataset for training is not recommended since it often results in insufficient shuffling on restarts, resulting in overfitting. It is highly recommended that you cache this task before training with it or use a data source that supports lower-level shuffling (e.g., FileDataSource).
SimdMeshImpl ignoring devices ['', '', '', '', '', '', '', '']
Using default tf glorot_uniform_initializer for variable encoder/block_000/layer_000/SelfAttention/relative_attention_bias The initialzer will guess the input and output dimensions based on dimension order.
Using default tf glorot_uniform_initializer for variable decoder/block_000/layer_000/SelfAttention/relative_attention_bias The initialzer will guess the input and output dimensions based on dimension order.
From /usr/local/lib/python3.7/dist-packages/tensorflow/python/training/saver.py:1161: get_checkpoint_mtimes (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file utilities to get mtimes.
From /usr/local/lib/python3.7/dist-packages/tensorflow_estimator/python/estimator/tpu/tpu_estimator.py:758: Variable.load (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Prefer Variable.assign which has equivalent behavior in 2.X.
I switched both the Google Cloud account and the Colab notebook to a completely new Gmail account. I think the problem is that something was updated in Google Colab regarding connections to Google Cloud TPUs.
Also, I can connect to my bucket normally using this code:
BASE_DIR = "gs://my_bucket/my_file" #@param { type: "string" }
if not BASE_DIR or BASE_DIR == "gs://":
    raise ValueError("You must enter a BASE_DIR.")
DATA_DIR = os.path.join(BASE_DIR, "data")
FINETUNE_MODELS_DIR = os.path.join(BASE_DIR, "models")
ON_CLOUD = True
if ON_CLOUD:
    print("Setting up GCS access...")
    import tensorflow_gcs_config
    from google.colab import auth
    # Set credentials for GCS reading/writing from Colab and TPU.
    TPU_TOPOLOGY = "v2-8"
    try:
        tpu = tf.distribute.cluster_resolver.TPUClusterResolver() # TPU detection
        TPU_ADDRESS = tpu.get_master()
        print('Running on TPU:', TPU_ADDRESS)
    except ValueError:
        raise BaseException('ERROR: Not connected to a TPU runtime; please see the previous cell in this notebook for instructions!')
    auth.authenticate_user()
    tf.enable_eager_execution()
    tf.config.experimental_connect_to_host(TPU_ADDRESS)
    tensorflow_gcs_config.configure_gcs_from_colab_auth()
tf.disable_v2_behavior()
# Improve logging.
from contextlib import contextmanager
import logging as py_logging
if ON_CLOUD:
    tf.get_logger().propagate = False
    py_logging.root.setLevel('INFO')
@contextmanager
def tf_verbosity_level(level):
    og_level = tf.logging.get_verbosity()
    tf.logging.set_verbosity(level)
    yield
    tf.logging.set_verbosity(og_level)
It would be great if someone could help me. I have been looking into this issue for a week and found nothing. Have there been any changes to how Google Colab works that I am not aware of?
Thanks in advance.

Testing a Jupyter Notebook

I am trying to come up with a method to test a number of Jupyter notebooks. A test should run when a new notebook is added in a GitHub branch and submitted in a pull request. The tests are not that complicated: they mostly check that the notebook runs end-to-end without any errors, plus maybe a few asserts. However:
There are certain calls in some cells that need to be mocked, e.g. a call to download the data from a database.
There may be some magic cells in the notebooks which run a pip command or something else.
I am open to using any testing library, such as pytest or unittest, although pytest is preferred.
I looked at a few libraries for testing notebooks such as nbmake, treon, and testbook, but I was unable to make them work. I also tried to convert the notebook to a python file, but the magic cells were converted to a get_ipython().run_cell_magic(...) call which became an issue, since pytest uses python and not ipython, and get_ipython() is only available in ipython.
So, I am wondering what is a good way to test jupyter notebooks with all of that in mind. Any help is appreciated.
One straightforward approach I've already used is to execute the entire notebook with nbconvert.
A notebook failed.ipynb raising an exception will result in a failed run thanks to the --execute option that tells nbconvert to execute the notebook prior to its conversion.
jupyter nbconvert --to notebook --execute failed.ipynb
# ...
# Exception: FAILED
echo $?
# 1
A correct notebook, passed.ipynb, results in a successful export.
jupyter nbconvert --to notebook --execute passed.ipynb
# [NbConvertApp] Converting notebook passed.ipynb to notebook
# [NbConvertApp] Writing 1172 bytes to passed.nbconvert.ipynb
echo $?
# 0
As a bonus, you can do the same through the API and wrap it in pytest:
import nbformat
import pytest
from nbconvert.preprocessors import ExecutePreprocessor
@pytest.mark.parametrize("notebook", ["passed.ipynb", "failed.ipynb"])
def test_notebook_exec(notebook):
    with open(notebook) as f:
        nb = nbformat.read(f, as_version=4)
    ep = ExecutePreprocessor(timeout=600, kernel_name='python3')
    try:
        assert ep.preprocess(nb) is not None, f"Got empty notebook for {notebook}"
    except Exception:
        assert False, f"Failed executing {notebook}"
Running the test gives:
pytest test_nbconv.py
# FAILED test_nbconv.py::test_notebook_exec[failed.ipynb] - AssertionError: Failed executing failed.ipynb
# PASSED test_nbconv.py::test_notebook_exec[passed.ipynb]
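The notebook list in the parametrize call is hard-coded. For the pull-request scenario in the question, the list could instead be collected from the repository. Here is a minimal sketch using only the standard library; the helper name find_notebooks is made up, and skipping .ipynb_checkpoints directories is an assumption about what should be excluded.

```python
from pathlib import Path


def find_notebooks(root="."):
    """Return sorted paths of all notebooks under root, skipping checkpoints."""
    return sorted(
        str(path)
        for path in Path(root).rglob("*.ipynb")
        if ".ipynb_checkpoints" not in path.parts
    )


print(find_notebooks("."))
```

The result can then be passed as the second argument to pytest.mark.parametrize in place of the literal list.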
Notes
There are several output formats; here I've used notebook.
This doesn’t convert a notebook to a different format per se, instead it allows the running of nbconvert preprocessors on a notebook, and/or conversion to other notebook formats.
The Python code example is just a quick draft; it can be improved considerably.
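The question also mentions magic cells that run pip or similar commands. One possible way to deal with them, which is an assumption on my part and not part of the approach above, is to strip magics from the notebook before executing it. The sketch below works on the notebook as a plain dictionary (a .ipynb file is JSON), so it needs no extra packages. The function name strip_magics is hypothetical, and the prefix matching is a rough heuristic: it would also drop a legitimate Python line that happens to start with %.

```python
import json


def strip_magics(nb):
    """Return a copy of a notebook dict without magic or shell lines."""
    cleaned = json.loads(json.dumps(nb))  # cheap deep copy of JSON data
    kept = []
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") != "code":
            kept.append(cell)
            continue
        source = cell.get("source", "")
        # The notebook format allows source as a string or a list of lines.
        text = "".join(source) if isinstance(source, list) else source
        if text.lstrip().startswith("%%"):
            continue  # cell magic: the whole cell is not plain Python
        lines = [
            line for line in text.splitlines()
            if not line.lstrip().startswith(("%", "!"))
        ]
        cell["source"] = "\n".join(lines)
        kept.append(cell)
    cleaned["cells"] = kept
    return cleaned


example = {
    "cells": [
        {"cell_type": "code", "source": ["%pip install pandas\n", "x = 1\n"]},
        {"cell_type": "code", "source": "%%bash\nls"},
        {"cell_type": "markdown", "source": "# Title"},
    ]
}
print(strip_magics(example)["cells"])
```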
Here is my own solution using testbook. Let's say I have a notebook called my_notebook.ipynb with the following content:
The trick is to inject a cell before my call to bigquery.Client and mock it:
from testbook import testbook
@testbook('./my_notebook.ipynb')
def test_get_details(tb):
    # pd and bigquery are expected to be imported by earlier notebook cells.
    tb.inject(
        """
        from unittest import mock
        mock_client = mock.MagicMock()
        mock_df = pd.DataFrame()
        mock_df['week'] = range(10)
        mock_df['count'] = 5
        p1 = mock.patch.object(bigquery, 'Client', return_value=mock_client)
        mock_client.query().result().to_dataframe.return_value = mock_df
        p1.start()
        """,
        before=2,
        run=False
    )
    tb.execute()
    dataframe = tb.get('dataframe')
    assert dataframe.shape == (10, 2)
    x = tb.get('x')
    assert x == 7

Having problems declaring SUMO_HOME

I'm trying to run a test Python script that uses the traci library, and it exits with "please declare environment SUMO_HOME".
I'm on Ubuntu 18.4.2 with SUMO 0.32.0. I solved this problem before by running
export SUMO_HOME=/home/gustavo/Downloads/sumo-0.32.0/tools/
but this time that did not solve it. So I tried setting the variable from inside the Python file, issuing the same command through the os library:
os.system("export SUMO_HOME=/home/gustavo/Downloads/sumo-0.32.0/tool/")
That didn't work either, so I came here to ask for help. Could any of you help me, please?
import os
import sys
import optparse
os.system("export SUMO_HOME=/home/gustavo/Downloads/sumo-0.32.0/tool/")
# we need to import some python modules from the $SUMO_HOME/tools directory
if 'SUMO_HOME' in os.environ:
    tools = os.path.join(os.environ['SUMO_HOME=/home/gustavo/Downloads/sumo-0.32.0/tools/'], 'tools')
    sys.path.append(tools)
else:
    sys.exit("please declare environment variable 'SUMO_HOME'")
from sumolib import checkBinary # Checks for the binary in environ vars
import traci
def get_options():
    opt_parser = optparse.OptionParser()
    opt_parser.add_option("--nogui", action="store_true",
                          default=False, help="run the commandline version of sumo")
    options, args = opt_parser.parse_args()
    return options
# contains TraCI control loop
def run():
    step = 0
    while traci.simulation.getMinExpectedNumber() > 0:
        traci.simulationStep()
        print(step)
        step += 1
    traci.close()
    sys.stdout.flush()
# main entry point
if __name__ == "__main__":
    options = get_options()
    # check binary
    if options.nogui:
        sumoBinary = checkBinary('sumo')
    else:
        sumoBinary = checkBinary('sumo-gui')
    # traci starts sumo as a subprocess and then this script connects and runs
    traci.start([sumoBinary, "-c", "demo.sumocfg",
                 "--tripinfo-output", "tripinfo.xml"])
    run()
I expected the steps to be printed to the terminal.
The correct location is probably
export SUMO_HOME=/home/gustavo/Downloads/sumo-0.32.0
without the tools or tool suffix. Setting it from inside the Python script with os.system will not work, because the export runs in a child shell and is lost as soon as that shell exits. You could modify os.environ directly instead.
Furthermore, you mixed up the lookup in os.environ in the script. It should read:
tools = os.path.join(os.environ['SUMO_HOME'], 'tools')
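Put together, setting the variable from inside the script might look like the sketch below. It reuses the installation path from the question as a fallback; whether that directory really is the SUMO root on the asker's machine is an assumption, and the constant name SUMO_ROOT is made up for illustration.

```python
import os
import sys

# Installation root taken from the question (no "tools" suffix).
SUMO_ROOT = "/home/gustavo/Downloads/sumo-0.32.0"

# Changes to os.environ are visible to this process and to any child
# processes it starts, unlike os.system("export ...").
# setdefault keeps an existing SUMO_HOME if one is already declared.
os.environ.setdefault("SUMO_HOME", SUMO_ROOT)

# Make the Python modules shipped in $SUMO_HOME/tools importable.
tools = os.path.join(os.environ["SUMO_HOME"], "tools")
if tools not in sys.path:
    sys.path.append(tools)

print("SUMO_HOME =", os.environ["SUMO_HOME"])
```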
I replaced the if/else part with the following code:
try:
    sys.path.append("/home/gustavo/Downloads/sumo-0.32.0/tools")
    from sumolib import checkBinary
except ImportError:
    sys.exit("please declare environment variable 'SUMO_HOME' as the root directory of your sumo installation (it should contain folders 'bin', 'tools' and 'docs')")
That solved the problem.

Running Tensorflow on JupyterNotebook instead of on Terminal commands

I wish to run some TensorFlow code in a Jupyter notebook.
To run it in a terminal, the link above gives instructions like this:
python src/validate_on_lfw.py ~/datasets/lfw/lfw_mtcnnpy_160 ~/models/facenet/20170512-110547
Question: how do I run it in a Jupyter notebook? Thanks.
For example, the script contains:
# Load the model
facenet.load_model(args.model)
Simply replacing args.model with ~/models/facenet/20170512-110547:
# Load the model
facenet.load_model('~/models/facenet/20170512-110547')
gives this error:
usage: ipykernel_launcher.py [-h] [--lfw_batch_size LFW_BATCH_SIZE]
[--image_size IMAGE_SIZE] [--lfw_pairs LFW_PAIRS]
[--lfw_file_ext {jpg,png}]
[--lfw_nrof_folds LFW_NROF_FOLDS]
lfw_dir model
ipykernel_launcher.py: error: too few arguments
sys.argv
Out[5]:
['/anaconda/envs/tensorflow/lib/python2.7/site-packages/ipykernel_launcher.py',
'-f',
'/Users/my_name/Library/Jupyter/runtime/kernel-770c12c9-8fbe-44f7-91dd-4b0a5c5d7537.json']
OK, here is a simple solution.
Run the script in a terminal as the GitHub instructions suggest, and print sys.argv there. It looks like this:
sys.argv = ['src/validate_on_lfw.py', '/Users/../datasets/lfw/lfw_mtcnnpy_160', '/Users/../models/facenet/20170512-110547']
Then use these values of sys.argv as the default values in def parse_arguments(argv) in the Jupyter notebook. That worked.
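A related option is to pass the argument list to the parser explicitly, so that it never reads the kernel's own sys.argv. The sketch below is a cut-down stand-in for the script's parser, not the real one: the option names come from the usage message above, while the default batch size of 100 is a placeholder. Because the shell normally expands ~ and a notebook does not, the paths are expanded with os.path.expanduser first.

```python
import argparse
import os


def parse_arguments(argv):
    """Parse an explicit argument list instead of the kernel's sys.argv."""
    parser = argparse.ArgumentParser()
    parser.add_argument("lfw_dir")
    parser.add_argument("model")
    # Placeholder default; the real script defines its own value.
    parser.add_argument("--lfw_batch_size", type=int, default=100)
    return parser.parse_args(argv)


args = parse_arguments([
    os.path.expanduser("~/datasets/lfw/lfw_mtcnnpy_160"),
    os.path.expanduser("~/models/facenet/20170512-110547"),
])
print(args.lfw_dir, args.model, args.lfw_batch_size)
```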