How to pass commandline argument when running a python code in google colab?
I have written a code which takes a file as input via sys.argv[]. How do I do this?
As far as I know, there is no special way to pass command line arguments to Python code in Colab. This is a working command I use when creating tfrecords:
!python generate_tfrecord.py --csv_input=data/test_labels.csv --output_path=data/test.record --image_dir=images/
I don't see any difference between regular command-line argument passing and Colab. Please add more code to your question to get better help.
I tried this in a Google Colab notebook:
import sys
sys.argv[0] = "first_arg"  # assign the first command line argument
sys.argv[1] = "second_arg"  # assign the second argument, for example
And it worked for me.
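A more defensive variant (my own sketch, not part of the answer above) is to replace sys.argv with a complete list, so its length does not depend on what the Colab kernel originally put there:
import sys

# Hypothetical invocation: pretend the code was started as
#   python my_script.py first_arg second_arg
sys.argv = ["my_script.py", "first_arg", "second_arg"]

# Any code that reads sys.argv (for example an argparse-based main) now sees these values.
print(sys.argv[1])  # -> first_arg
print(sys.argv[2])  # -> second_arg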
So if you want to run a Python script that is normally invoked like this:
!python test.py --image_folder '/content/image' --workers 2 --Prediction CTC --rgb True
You have to open test.py (or your file) in an editor, where you will find lines similar to these:
parser = argparse.ArgumentParser()
parser.add_argument('--image_folder', required=True, help='path to image_folder')
parser.add_argument('--workers', type=int, default=1, help='number of workers')
parser.add_argument('--Prediction', type=str, default='CTC', help='Prediction stage.')
parser.add_argument('--rgb', action='store_true', help='use rgb input')
args = parser.parse_args()
But in a notebook this will give you an error: SystemExit: 2
Then you have to change it like this:
parser = argparse.ArgumentParser()
parser.add_argument('--image_folder', required=False, default='/content/image', help='path to image_folder')
parser.add_argument('--workers', type=int, default=2, help='number of workers')
parser.add_argument('--Prediction', type=str, default='CTC', help='Prediction stage.')
parser.add_argument('--rgb', action='store_false', help='use rgb input')
parser.add_argument("-f", "--file", required=False)
args = parser.parse_args()
You must add this parser.add_argument line at the end; it absorbs the -f argument that the notebook kernel passes:
parser.add_argument("-f", "--file", required=False)
Then you can access the command line arguments like this:
image = args.image_folder
Or
img = Image.open(args.image_folder)
workers = args.workers
But if your last line is like this:
args = vars(parser.parse_args())
Then you have to access it like this:
image = args["image_folder"]
Or
img = Image.open(args["image_folder"])
workers = args["workers"]
Note: action='store_true' will default to False, and becomes True when the flag is passed.
Likewise, action='store_false' will default to True, and becomes False when the flag is passed.
Tested with Google Colab.
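Another option (my own sketch, not from the answer above): instead of relaxing required arguments and adding a dummy -f option, argparse's parse_known_args() simply ignores arguments it does not recognise, including the -f flag the notebook kernel passes:
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--image_folder', default='/content/image', help='path to image_folder')
parser.add_argument('--workers', type=int, default=2, help='number of workers')

# parse_known_args() returns (known_args, leftover) and silently skips anything
# it does not know about, e.g. the kernel's "-f /path/to/kernel.json".
args, _unknown = parser.parse_known_args()
print(args.image_folder, args.workers)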
I made a bioinformatics tool locally on my machine to parse large UniProt protein data files.
The tool needs different parameters passed as command line arguments. After it was working locally, I uploaded the data files and Python source files to my Google Drive.
I did not make any changes to my files. I just ran the following command directly in Google Colab:
!python3 drive/MyDrive/uniprot/uniprot_select.py FIELDS "ID,OS,SQ" FROM drive/MyDrive/data/uniprot.dat WHERE "SQ#EYDRRR" FASTA
It works perfectly!
No need for special parsing, no need for additional imports. All the work you normally do locally on your machine can be executed without changes.
I am trying to come up with a method to test a number of Jupyter notebooks. A test should run when a new notebook is implemented in a Github branch and submitted for a pull request. The tests are not that complicated, they are mostly just testing if the notebook runs end-to-end and without any errors, and maybe a few asserts. However:
There are certain calls in some cells that need to be mocked, e.g. a call to download the data from a database.
There may be some magic cells in the notebooks which run a pip command or something else.
I am open to using any testing library, such as pytest or unittest, although pytest is preferred.
I looked at a few libraries for testing notebooks such as nbmake, treon, and testbook, but I was unable to make them work. I also tried to convert the notebook to a python file, but the magic cells were converted to a get_ipython().run_cell_magic(...) call which became an issue, since pytest uses python and not ipython, and get_ipython() is only available in ipython.
So, I am wondering what is a good way to test jupyter notebooks with all of that in mind. Any help is appreciated.
One straightforward approach I've already used is to execute the entire notebook with nbconvert.
A notebook failed.ipynb that raises an exception will result in a failed run, thanks to the --execute option, which tells nbconvert to execute the notebook prior to its conversion.
jupyter nbconvert --to notebook --execute failed.ipynb
# ...
# Exception: FAILED
echo $?
# 1
Another correct notebook passed.ipynb will result in a successful export.
jupyter nbconvert --to notebook --execute passed.ipynb
# [NbConvertApp] Converting notebook passed.ipynb to notebook
# [NbConvertApp] Writing 1172 bytes to passed.nbconvert.ipynb
echo $?
# 0
Cherry on the cake, you can do the same through the API and so wrap it in Pytest!
import nbformat
import pytest
from nbconvert.preprocessors import ExecutePreprocessor
@pytest.mark.parametrize("notebook", ["passed.ipynb", "failed.ipynb"])
def test_notebook_exec(notebook):
    with open(notebook) as f:
        nb = nbformat.read(f, as_version=4)
        ep = ExecutePreprocessor(timeout=600, kernel_name='python3')
        try:
            assert ep.preprocess(nb) is not None, f"Got empty notebook for {notebook}"
        except Exception:
            assert False, f"Failed executing {notebook}"
Running the test gives:
pytest test_nbconv.py
# FAILED test_nbconv.py::test_notebook_exec[failed.ipynb] - AssertionError: Failed executing failed.ipynb
# PASSED test_nbconv.py::test_notebook_exec[passed.ipynb]
Notes
There are several output formats; I've used notebook here.
This doesn’t convert a notebook to a different format per se, instead it allows the running of nbconvert preprocessors on a notebook, and/or conversion to other notebook formats.
The Python code example is just a quick draft; it can be improved.
Here is my own solution using testbook. Let's say I have a notebook called my_notebook.ipynb that calls bigquery.Client to load a dataframe called dataframe and also computes a value x.
The trick is to inject a cell before my call to bigquery.Client and mock it:
from testbook import testbook
@testbook('./my_notebook.ipynb')
def test_get_details(tb):
    tb.inject(
        """
        import mock
        mock_client = mock.MagicMock()
        mock_df = pd.DataFrame()
        mock_df['week'] = range(10)
        mock_df['count'] = 5
        p1 = mock.patch.object(bigquery, 'Client', return_value=mock_client)
        mock_client.query().result().to_dataframe.return_value = mock_df
        p1.start()
        """,
        before=2,
        run=False
    )
    tb.execute()
    dataframe = tb.get('dataframe')
    assert dataframe.shape == (10, 2)
    x = tb.get('x')
    assert x == 7
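To run it (assuming the test above is saved in a file such as test_my_notebook.py, a name chosen here for illustration), invoke pytest as usual:
pytest test_my_notebook.py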
I am a university student using my university's computing cluster.
I installed TeX Live in my home directory at ~/.local/texlive/. I have a file called mplrc, and the MATPLOTLIBRC environment variable is set to point at it. The mplrc file contains the following lines:
backend: pgf
pgf.rcfonts: false
pgf.texsystem: pdflatex
pgf.preamble: \input{mpl_settings.tex}
text.usetex: true
font.family: serif
font.size: 12
The mpl_settings.tex file is in the same directory as the mplrc file and contains the following
\usepackage{amsmath}
\usepackage[T1]{fontenc}
\usepackage{gensymb}
\usepackage{lmodern}
\usepackage{siunitx}
On the cluster I am using, I must submit a SLURM job to run the Jupyter notebook. The example code I am trying to run within the notebook is
import numpy as np
import matplotlib.pyplot as plt

formula = (
r'$\displaystyle '
r'N = \int_{E_\text{min}}^{E_\text{max}} '
r'\int_0^A'
r'\int_{t_\text{min}}^{t_\text{max}} '
r'\Phi_0 \left(\frac{E}{\SI{1}{\GeV}}\right)^{\!\!-γ}'
r' \, \symup{d}A \, \symup{d}t \, \symup{d}E'
r'$'
)
def power_law_spectrum(energy, normalisation, spectral_index):
    return normalisation * energy**(-spectral_index)
bin_edges = np.logspace(2, 5, 15)
bin_centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])
y = power_law_spectrum(bin_centers, 1e-5, 2.5)
relative_error = np.random.normal(1, 0.2, size=len(y))
y_with_err = relative_error * y
fig, ax = plt.subplots()
ax.errorbar(
np.log10(bin_centers),
y_with_err,
xerr=[
np.log10(bin_centers) - np.log10(bin_edges[:-1]),
np.log10(bin_edges[1:]) - np.log10(bin_centers)
],
yerr=0.5 * y_with_err,
linestyle='',
)
ax.text(0.1, 0.1, formula, transform=plt.gca().transAxes)
ax.set_yscale('log')
fig.tight_layout(pad=0)
plt.show()
This generates an enormous error message, but the root of it is
RuntimeError: latex was not able to process the following string:
b'lp'
However, underneath that, I see what I think is the real problem
! LaTeX Error: File `article.cls' not found.
I've set my PATH so that it finds the right latex command, but what else needs to be set in order to find the article.cls file? It seems like it's something particular to the Python notebook. When running kpsewhich article.cls in a terminal within the Jupyterlab interface, the file gets found. But trying ! kpsewhich article.cls or subprocess.run(['kpsewhich', 'article.cls']) within the Python notebook does not find the file.
I figured it out. I forgot I had run a section of code which set
TEXINPUTS=/path/to/some/directory
It looks like I missed a : in my TEXINPUTS, so TeX was only looking in /path/to/some/directory.
The solution was to have
TEXINPUTS=/path/to/some/directory:
The trailing colon makes TeX append its default search path, so it still looked in /path/to/some/directory but also continued looking in the standard locations (including the current directory).
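For completeness, a minimal sketch (my own, not from the answer) of setting this from inside a notebook cell, assuming the extra TeX files live in /path/to/some/directory:
import os

# The trailing ':' keeps TeX's default search path in addition to the extra directory.
os.environ['TEXINPUTS'] = '/path/to/some/directory:'

# Any LaTeX process started after this point (for example by matplotlib's pgf
# backend) inherits the corrected search path.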
I wish to run some TensorFlow code in a Jupyter Notebook.
If I run it in a terminal, the link above gives instructions like this:
python src/validate_on_lfw.py ~/datasets/lfw/lfw_mtcnnpy_160 ~/models/facenet/20170512-110547
Question: how do I run it in a Jupyter notebook? Thanks.
e.g.,
# Load the model
facenet.load_model(args.model)
Simply replacing args.model with '~/models/facenet/20170512-110547':
# Load the model
facenet.load_model('~/models/facenet/20170512-110547')
will give the error:
usage: ipykernel_launcher.py [-h] [--lfw_batch_size LFW_BATCH_SIZE]
[--image_size IMAGE_SIZE] [--lfw_pairs LFW_PAIRS]
[--lfw_file_ext {jpg,png}]
[--lfw_nrof_folds LFW_NROF_FOLDS]
lfw_dir model
ipykernel_launcher.py: error: too few arguments
sys.argv
Out[5]:
['/anaconda/envs/tensorflow/lib/python2.7/site-packages/ipykernel_launcher.py',
'-f',
'/Users/my_name/Library/Jupyter/runtime/kernel-770c12c9-8fbe-44f7-91dd-4b0a5c5d7537.json']
Ok, simple solution...
Simply run it in a terminal as the GitHub repository suggests and, in the meantime, print out sys.argv in the terminal; it will contain something like this:
sys.argv = ['src/validate_on_lfw.py', '/Users/../datasets/lfw/lfw_mtcnnpy_160', '/Users/../models/facenet/20170512-110547']
Then use these values of sys.argv as the default values in def parse_arguments(argv) in the Jupyter Notebook, and it worked.
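Alternatively, here is a sketch of my own (assuming validate_on_lfw.py exposes parse_arguments(argv) and a main(args) function, and that src/ is importable): override sys.argv inside the notebook before calling the script's entry point, so no source edits are needed.
import sys
import validate_on_lfw  # assumes src/ has been added to the Python path

# Pretend the script was invoked from the command line with these arguments
# (paths copied from the terminal run above).
sys.argv = [
    'src/validate_on_lfw.py',
    '/Users/../datasets/lfw/lfw_mtcnnpy_160',
    '/Users/../models/facenet/20170512-110547',
]

args = validate_on_lfw.parse_arguments(sys.argv[1:])
validate_on_lfw.main(args)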
When I pass command line parameters to my Python program via PyCharm, I find them as usual in sys.argv, but tf.app.flags.FLAGS instead reports empty strings.
If I run the same program outside PyCharm (from command line), then tf.app.flags.FLAGS reports the expected command line parameter values.
See screenshot below to see how I pass command line parameters in PyCharm.
Here is a short program that reproduces the issue:
import tensorflow as tf
from sys import argv

flags = tf.app.flags
FLAGS = flags.FLAGS

# command line flags
flags.DEFINE_string('input', '', "input file (.p)")


def main(_):
    print('Parameters', argv)
    print('input', FLAGS.input or 'is empty')


# parses flags and calls the `main` function above
if __name__ == '__main__':
    tf.app.run()
If I run it from command line, I get the expected output:
python3 issue.py --input my_data.p
Parameters ['issue.py', '--input', 'my_data.p']
input my_data.p
But from PyCharm, input is set to an empty string:
/usr/bin/python3.5 /home/fanta/workspace/CarND-Transfer-Learning-Lab/issue.py "--input my_data.p"
Parameters ['/home/fanta/workspace/CarND-Transfer-Learning-Lab/issue.py', '--input my_data.p']
input is empty
How can I get input to contain the command line parameter, using Tensorflow's tf.app.flags.FLAGS and PyCharm?
Don't put the script parameters into quotation marks "...". Just write
--input my_data.p
in the Script parameters field shown in your screenshot.
I have a neural network created and trained in CNTK. I can save it with model.save_model("mymodel.dnn") in Python. This produces a file serialized in protobuf format.
How can I either save the model as plain text or convert the .dnn file to plain text?
The format CNTK uses is protobuf. Therefore you can use things like
import google.protobuf.text_format
to create a readable output. This page has further information.
Our protobuf files are currently in this location. I'm hard linking to version 2b9. Make sure you use the right .proto file.
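A rough sketch of that idea (my own, under assumptions: you have compiled CNTK.proto with protoc --python_out=. into a CNTK_pb2 module, and the top-level message stored in the model file is Dictionary, as the next answer indicates):
from google.protobuf import text_format
import CNTK_pb2  # module generated from CNTK.proto; the name is an assumption

# Parse the binary model file into the protobuf message ...
model_dict = CNTK_pb2.Dictionary()
with open('mymodel.dnn', 'rb') as f:
    model_dict.ParseFromString(f.read())

# ... then write it back out as human-readable text.
with open('mymodel.txt', 'w') as f:
    f.write(text_format.MessageToString(model_dict))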
The protobuf compiler can generate a textual representation from a binary model file; you just need to point it to the CNTK proto definition and tell it to expect a Dictionary inside the model file:
%PROTOBUF_PATH%\bin\protoc --decode CNTK.proto.Dictionary --proto_path [CNTK root]\Source\CNTKv2LibraryDll\proto\ [CNTK root]\Source\CNTKv2LibraryDll\proto\CNTK.proto < mymodel.dnn > mymodel.txt
With BrainScript you can add:
command = <yourCommands>:DumpNodeInfo
modelDir = "./ANNmodel"
modelPath = "$modelDir$/NN.dnn"
...
# dump parameter values
DumpNodeInfo = {
action = "dumpNode"
printValues = true
}
See the CNTK documentation for more information.
You can convert a model trained by CNTK to text format with CNTK's dumpnode command. Here are the contents of a config file txt.conf:
command = convert2txt
convert2txt = [
action = "dumpnode"
modelPath="./cntkSpeechFF.dnn.5"
nodeName = "Prior" # if not specified, all nodes will be printed
outputFile = "./cntkSpeechFF.dnn.5.txt" # the path to the output file. If not specified a file name will be automatically generated based on the modelPath.
printValues = true
printMetadata = true
]
Then you run cntk as
cntk configFile=txt.conf