'%matplotlib notebook' behavior in Jupyter Lab [duplicate] - ipython-magic

With old Jupyter notebooks, I could create interactive plots via:
import matplotlib.pyplot as plt
%matplotlib notebook
x = [1,2,3]
y = [4,5,6]
plt.figure()
plt.plot(x,y)
However, in JupyterLab, this gives an error:
JavaScript output is disabled in JupyterLab
I have also tried the magic (with jupyter-matplotlib installed):
%matplotlib ipympl
But that just returns:
FigureCanvasNbAgg()
Inline plots work, but they are not interactive plots:
%matplotlib inline

JupyterLab 3.0+
Install jupyterlab and ipympl.
For pip users:
pip install --upgrade jupyterlab ipympl
For conda users:
conda update -c conda-forge jupyterlab ipympl
Restart JupyterLab.
Decorate the cell containing plotting code with the header:
%matplotlib widget
# plotting code goes here
JupyterLab 2.0
Install nodejs, e.g. conda install -c conda-forge nodejs.
Install ipympl, e.g. conda install -c conda-forge ipympl.
[Optional, but recommended.] Update JupyterLab, e.g.
conda update -c conda-forge jupyterlab==2.2.9==py_0.
[Optional, but recommended.] For a local user installation, run:
export JUPYTERLAB_DIR="$HOME/.local/share/jupyter/lab".
Install extensions:
jupyter labextension install #jupyter-widgets/jupyterlab-manager
jupyter labextension install jupyter-matplotlib
Enable widgets: jupyter nbextension enable --py widgetsnbextension.
Restart JupyterLab.
Decorate with %matplotlib widget.

To enable the jupyter-matplotlib backend, use the matplotlib Jupyter magic:
%matplotlib widget
import matplotlib.pyplot as plt
plt.figure()
x = [1,2,3]
y = [4,5,6]
plt.plot(x,y)
More info here jupyter-matplotlib on GitHub

As per Georgy's suggestion, this was caused by Node.js not being installed.

Steps for JupyterLab 3.*
I had previously used Mateen's answer several times, but when I tried them with JupyterLab 3.0.7 I found that jupyter labextension install #jupyter-widgets/jupyterlab-manager returned an error and I had broken widgets.
After a lot of headaches and googling I thought I would post the solution for anyone else who finds themselves here.
The steps are now simplified, and I was able to get back to working interactive plots with the following:
pip install jupyterlab
pip install ipympl
Decorate with %matplotlib widget
Step 2 will automatically take care of the rest of the dependencies, including the replacements for (the now depreciated?) #jupyter-widgets/jupyterlab-manager
Hope this saves someone else some time!

Summary
In a complex setup, where jupyter-lab process and the Jupyter/IPython kernel process are running in different Python virtual environments, pay attention to Jupyter-related Python package and Jupyter extension (e.g. ipympl, jupyter-matplotlib) versions and their compatibility between the environments.
And even in single Python virtual environment make sure you comply with the ipympl compatibility table.
Example
A couple of examples how to run JupyterLab.
Simple(st)
The simplest cross-platform way to run JupyterLab, I guess, is running it from a Docker container. You can build and run JupyterLab 3 container like this.
docker run --name jupyter -it -p 8888:8888 \
# This line on a Linux- and non-user-namespaced Docker will "share"
# the directory between Docker host and container, and run from the user.
-u 1000 -v $HOME/Documents/notebooks:/tmp/notebooks \
-e HOME=/tmp/jupyter python:3.8 bash -c "
mkdir /tmp/jupyter; \
pip install --user 'jupyterlab < 4' 'ipympl < 0.8' pandas matplotlib; \
/tmp/jupyter/.local/bin/jupyter lab --ip=0.0.0.0 --port 8888 \
--no-browser --notebook-dir /tmp/notebooks;
"
When it finishes (and it'll take a while), the bottommost lines in the terminal should be something like.
To access the server, open this file in a browser:
...
http://127.0.0.1:8888/lab?token=abcdef...
You can just click on that link and JupyterLab should open in your browser. Once you shut down the JupyterLab instance the container will stop. You can restart it with docker start -ai jupyter.
Complex
This GitHub Gist illustrates the idea how to build a Python virtual environment with JupyterLab 2 and also building all required extensions with Nodejs in the container, without installing Nodejs on host system. With JupyterLab 3 and pre-build extensions this approach gets less relevant.
Context
I was scratching my head today while debugging the %matplotlib widget not working in JupyterLab 2. I have separate pre-built JupyterLab venv (as described above) which powers local JupyterLab as Chromium "app mode" (i.e. c.LabApp.browser = 'chromium-browser --app=%s' in the config), and a few IPython kernels from simple Python venvs with specific dependencies (rarely change) and an application exposing itself as an IPython kernel. The issue with the interactive "widget" mode manifested in different ways.
For instance, having
in JupyterLab "host" venv: jupyter-matplotlib v0.7.4 extension and ipympl==0.6.3
in the kernel venv: ipympl==0.7.0 and matplotlib==3.4.2
In the browser console I had these errors:
Error: Module jupyter-matplotlib, semver range ^0.9.0 is not registered as a widget module
Error: Could not create a model.
Could not instantiate widget
In the JupyterLab UI:
%matplotlib widget succeeds on restart
Charts stuck in "Loading widget..."
Nothing on re-run of the cell with chart output
On previous attempts %matplotlib widget could raise something like KeyError: '97acd0c8fb504a2288834b349003b4ae'
On downgrade of ipympl==0.6.3 in the kernel venv in the browser console:
Could not instantiate widget
Exception opening new comm
Error: Could not create a model.
Module jupyter-matplotlib, semver range ^0.8.3 is not registered as a widget module
Once I made the packages/extensions according to ipympl compatibility table:
in JupyterLab "host" venv: jupyter-matplotlib v0.8.3 extension, ipympl==0.6.3
in the kernel venv: ipympl==0.6.3, matplotlib==3.3.4
It more or less works as expected. Well, there are verious minor glitches like except I put %matplotlib widget per cell with chart, say on restart, the first chart "accumulates" all the contents of all the charts in the notebook. With %matplotlib widget per cell, only one chart is "active" at a time. And on restart only last widget is rendered (but manual re-run of a cell remediates).

This solution works in jupyterlab
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import clear_output
n = 10
a = np.zeros((n, n))
plt.figure()
for i in range(n):
plt.imshow(a)
plt.show()
a[i, i] = 1
clear_output(wait=True)

Related

How to make persistent installation with Colaboratory?

Is there a way to have a persistent installation on the machine you get when you launch a notebook from the Colaboratory Environment ?
There is such a mechanism with mybinder.org with a requirements.txt or setup.py that specify the different packages you want at the startup.
https://mybinder.readthedocs.io/en/latest/config_files.html#requirements-txt-install-a-python-environment
I have tested a colab notebook with an installation procedure but I have to rerun a sequence of cells each time I want to work.
Try:
https://colab.research.google.com/drive/1u5Y-92-b4rVcJjkUpPPa5xnuvKAHcnNa
Also how to define environment variables for once (at startup) ?
Do I have also to rerun their settings each time ?
Thanks
Patrick
Here's how I install a library jdc permanently, by installing it in Google Drive.
import os, sys
from google.colab import drive
drive.mount('/content/mnt')
nb_path = '/content/notebooks'
os.symlink('/content/mnt/My Drive/Colab Notebooks', nb_path)
sys.path.insert(0, nb_path)
# call this one time only
!pip install --target=$nb_path jdc
# later just import it
import jdc # for %%add_to
And here's how to set environmental variables.
%env VAR1=value1
%env VAR2=value2
Put them in your first cell and run it.

py:445: UserWarning: Matplotlib is currently using agg, which is a non-GUI backend, so cannot show the figure. % get_backend())

I was trying to run the Mask-RCNN repository provided by the matterport in Github. https://github.com/matterport/Mask_RCNN. when I run the demo in the anaconda, it showed "C:\Anaconda\lib\site-packages\matplotlib\figure.py:445: UserWarning: Matplotlib is currently using agg, which is a non-GUI backend, so cannot show the figure. % get_backend()) ". Is there someone came cross the similar problem?
If you are trying to plot on a remote server, ssh X11 forwarding can then be used to display matplotlib plots.
Try to use this,
import matplotlib
matplotlib.use('tkagg')
Make sure you have XMing or XQuartz (if you're on mac), and use -Y
$ shh -Y username#servidorIP
Enter or change the line in ~/.config/matplotlib/matplotlibrc starting with "backend : " in:
backend : tkagg

Install RAPIDS library on Googe Colab notebook

I was wondering if I could install RAPIDS library (executing machine learning tasks entirely on GPU) in Google Colaboratory notebook?
I've done some research but I've not been able to find the way to do that...
This is now possible with the new T4 instances https://medium.com/rapids-ai/run-rapids-on-google-colab-for-free-1617ac6323a8
To enable cuGraph too, you can replace the wget command with:
!conda install -c nvidia/label/cuda10.0 -c rapidsai/label/cuda10.0 -c pytorch \
-c numba -c conda-forge -c numba -c defaults \
boost cudf=0.6 cuml=0.6 python=3.6 cugraph=0.6 -y
Dec 2019 update
New process for RAPIDS v0.11+
Because
RAPIDS v0.11 has dependencies (pyarrow) which were
not covered by the prior install script,
the notebooks-contrib repo, which contains RAPIDS demo notebooks (e.g.
colab_notebooks) and the Colab install script, now follows RAPIDS standard version-specific branch structure*
and some Colab users still enjoy v0.10,
our honorable notebooks-contrib overlord taureandyernv has updated the script which now:
If running v0.11 or higher, updates pyarrow library to 0.15.x.
Here's the code cell to run in Colab for v0.11:
# Install RAPIDS
!wget -nc https://raw.githubusercontent.com/rapidsai/notebooks-contrib/890b04ed8687da6e3a100c81f449ff6f7b559956/utils/rapids-colab.sh
!bash rapids-colab.sh
import sys, os
dist_package_index = sys.path.index("/usr/local/lib/python3.6/dist-packages")
sys.path = sys.path[:dist_package_index] + ["/usr/local/lib/python3.6/site-packages"] + sys.path[dist_package_index:]
sys.path
if os.path.exists('update_pyarrow.py'): ## This file only exists if you're using RAPIDS version 0.11 or higher
exec(open("update_pyarrow.py").read(), globals())
For a walk thru setting up Colab & implementing this script, see How to Install RAPIDS in Google Colab
-* e.g. branch-0.11 for v0.11 and branch-0.12 for v0.12 with default set to the current version
Looks like various subparts are not yet pip-installable so the only way to get them on colab would be to build them on colab, which might be more effort than you're interested in investing in this :)
https://github.com/rapidsai/cudf/issues/285 is the issue to watch for rapidsai/cudf (presumably the other rapidsai/ libs will follow suit).
Latest solution;
!wget -nc https://github.com/rapidsai/notebooks-extended/raw/master/utils/rapids-colab.sh
!bash rapids-colab.sh
import sys, os
sys.path.append('/usr/local/lib/python3.6/site-packages/')
os.environ['NUMBAPRO_NVVM'] = '/usr/local/cuda/nvvm/lib64/libnvvm.so'
os.environ['NUMBAPRO_LIBDEVICE'] = '/usr/local/cuda/nvvm/libdevice/'
was pushed a few days ago, see issues #104 or #110, or the full rapids-colab.sh script for more info.
Note: instillation currently requires a Tesla T4 instance, checking for this can be done with;
# check gpu type
!nvidia-smi
import pynvml
pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)
device_name = pynvml.nvmlDeviceGetName(handle)
# your dolphin is broken, please reset & try again
if device_name != b'Tesla T4':
raise Exception("""Unfortunately this instance does not have a T4 GPU.
Please make sure you've configured Colab to request a GPU instance type.
Sometimes Colab allocates a Tesla K80 instead of a T4. Resetting the instance.
If you get a K80 GPU, try Runtime -> Reset all runtimes...""")
# got a T4, good to go
else:
print('Woo! You got the right kind of GPU!')

How to render OpenAI gym in google Colab? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 2 years ago.
Improve this question
I'm trying to use OpenAI gym in google colab. As the Notebook is running on a remote server I can not render gym's environment.
I found some solution for Jupyter notebook, however, these solutions do not work with colab as I don't have access to the remote server.
I wonder if someone knows a workaround for this that works with google Colab?
Korakot's answer is not correct.
You can indeed render OpenAi Gym in colaboratory, albiet kind of slowly using none other than matplotlib.
Heres how:
Install xvfb & other dependencies (Thanks to Peter for his comment)
!apt-get install x11-utils > /dev/null 2>&1
!pip install pyglet > /dev/null 2>&1
!apt-get install -y xvfb python-opengl > /dev/null 2>&1
As well as pyvirtual display:
!pip install gym pyvirtualdisplay > /dev/null 2>&1
then import all your libraries, including matplotlib & ipythondisplay:
import gym
import numpy as np
import matplotlib.pyplot as plt
from IPython import display as ipythondisplay
then you want to import Display from pyvirtual display & initialise your screen size, in this example 400x300... :
from pyvirtualdisplay import Display
display = Display(visible=0, size=(400, 300))
display.start()
last but not least, using gym's "rgb_array" render functionally, render to a "Screen" variable, then plot the screen variable using Matplotlib! (rendered indirectly using Ipython display)
env = gym.make("CartPole-v0")
env.reset()
prev_screen = env.render(mode='rgb_array')
plt.imshow(prev_screen)
for i in range(50):
action = env.action_space.sample()
obs, reward, done, info = env.step(action)
screen = env.render(mode='rgb_array')
plt.imshow(screen)
ipythondisplay.clear_output(wait=True)
ipythondisplay.display(plt.gcf())
if done:
break
ipythondisplay.clear_output(wait=True)
env.close()
Link to my working Colaboratory notebook demoing cartpole:
https://colab.research.google.com/drive/16gZuQlwxmxR5ZWYLZvBeq3bTdFfb1r_6
Note: not all Gym Environments support "rgb_array" render mode, but most of the basic ones do.
Try this :-
!apt-get install python-opengl -y
!apt install xvfb -y
!pip install pyvirtualdisplay
!pip install piglet
from pyvirtualdisplay import Display
Display().start()
import gym
from IPython import display
import matplotlib.pyplot as plt
%matplotlib inline
env = gym.make('CartPole-v0')
env.reset()
img = plt.imshow(env.render('rgb_array')) # only call this once
for _ in range(40):
img.set_data(env.render('rgb_array')) # just update the data
display.display(plt.gcf())
display.clear_output(wait=True)
action = env.action_space.sample()
env.step(action)
This worked for me so I guess it should also work for you.
I recently had to solve the same problem and have written up a blog post with my solution. For ease of reference I am re-posting the TLDR; version here.
Paste this code into a cell in Colab and run it to install all of the dependencies.
%%bash
# install required system dependencies
apt-get install -y xvfb x11-utils
# install required python dependencies (might need to install additional gym extras depending)
pip install gym[box2d]==0.17.* pyvirtualdisplay==0.2.* PyOpenGL==3.1.* PyOpenGL-accelerate==3.1.*
And then start a virtual display in the background.
import pyvirtualdisplay
_display = pyvirtualdisplay.Display(visible=False, # use False with Xvfb
size=(1400, 900))
_ = _display.start()
In the blog post I also provide a sample simulation demo that demonstrates that the above actually works.
By far the best solution I found after spending countless hours on this issue is to record & play video. It comes UX wise very close to the real render function.
Here is a Google colab notebook that records & renders video.
https://colab.research.google.com/drive/12osEZByXOlGy8J-MSpkl3faObhzPGIrB
enjoy :)

Shipping and using virtualenv in a pyspark job

PROBLEM: I am attempting to run a spark-submit script from my local machine to a cluster of machines. The work done by the cluster uses numpy. I currently get the following error:
ImportError:
Importing the multiarray numpy extension module failed. Most
likely you are trying to import a failed build of numpy.
If you're working with a numpy git repo, try `git clean -xdf` (removes all
files not under version control). Otherwise reinstall numpy.
Original error was: cannot import name multiarray
DETAIL:
In my local environment I have setup a virtualenv that includes numpy as well as a private repo I use in my project and other various libraries. I created a zip file (lib/libs.zip) from the site-packages directory at venv/lib/site-packages where 'venv' is my virtual environment. I ship this zip to the remote nodes. My shell script for performing the spark-submit looks like this:
$SPARK_HOME/bin/spark-submit \
--deploy-mode cluster \
--master yarn \
--conf spark.pyspark.virtualenv.enabled=true \
--conf spark.pyspark.virtualenv.type=native \
--conf spark.pyspark.virtualenv.requirements=${parent}/requirements.txt \
--conf spark.pyspark.virtualenv.bin.path=${parent}/venv \
--py-files "${parent}/lib/libs.zip" \
--num-executors 1 \
--executor-cores 2 \
--executor-memory 2G \
--driver-memory 2G \
$parent/src/features/pi.py
I also know that on the remote nodes there is a /usr/local/bin/python2.7 folder that includes a python 2.7 install.
so in my conf/spark-env.sh I have set the following:
export PYSPARK_PYTHON=/usr/local/bin/python2.7
export PYSPARK_DRIVER_PYTHON=/usr/local/bin/python2.7
When I run the script I get the error above. If I screen print the installed_distributions I get a zero length list []. Also my private library imports correctly (which says to me it is actually accessing my libs.zip site-packages.). My pi.py file looks something like this:
from myprivatelibrary.bigData.spark import spark_context
spark = spark_context()
import numpy as np
spark.parallelize(range(1, 10)).map(lambda x: np.__version__).collect()
EXPECTATION/MY THOUGHTS:
I expect this to import numpy correctly especially since I know numpy works correctly in my local virtualenv. I suspect this is because I'm not actually using the version of python that is installed in my virtualenv on the remote node. My question is first, how do I fix this and second how do I use my virtualenv installed python on the remote nodes instead of the python that is just manually installed and currently sitting on those machines? I've seen some write-ups on this but frankly they are not well written.
With --conf spark.pyspark.{} and export PYSPARK_PYTHON=/usr/local/bin/python2.7 you set options for your local environment / your driver. To set options for the cluster (executors) use the following syntax:
--conf spark.yarn.appMasterEnv.PYSPARK_PYTHON
Furthermore, I guess you should make your virtualenv relocatable (this is experimental, however). <edit 20170908> This means that the virtualenv uses relative instead of absolute links. </edit>
What we did in such cases: we shipped an entire anaconda distribution over hdfs.
<edit 20170908>
If we are talking about different environments (MacOs vs. Linux, as mentioned in the comment below), you cannot just submit a virtualenv, at least not if your virtualenv contains packages with binaries (as is the case with numpy). In that case I suggest you create yourself a 'portable' anaconda, i.e. install Anaconda in a Linux VM and zip it.
Regarding --archives vs. --py-files:
--py-files adds python files/packages to the python path. From the spark-submit documentation:
For Python applications, simply pass a .py file in the place of instead of a JAR, and add Python .zip, .egg or .py files to the search path with --py-files.
--archives means these are extracted into the working directory of each executor (only yarn clusters).
However, a crystal-clear distinction is lacking, in my opinion - see for example this SO post.
In the given case, add the anaconda.zip via --archives, and your 'other python files' via --py-files.
</edit>
See also: Running Pyspark with Virtualenv, a blog post by Henning Kropp.