Fast image sequences / animation in Jupyter Notebook with matplotlib - matplotlib

I can't seem to find a simple and fast way of plotting image sequences with plain matplotlib in a Jupyter Notebook. I've tried FuncAnimation, fig.canvas.draw(), blitting, as well as just the standard imshow-pause combo; without success or with very slow refresh rate. I don't need the images to be interactive - they just need to be shown sequentially and can't pop up a new figure window for each image. I've seen many solutions here, with none seeming to work the way I want.
My general pipeline does significant processing, with each image generated and plotted within a while or for loop. FuncAnimation is not desirable since it requires passing a function handle and my use case involves many arguments and state variables that make it difficult to use.
The best I've got is the working example below using fig.canvas.draw() - showing that drawing time increases linearly per iteration, where I need it to remain constant!
import numpy as np
import matplotlib.pyplot as plt
from timeit import default_timer as timer
%matplotlib notebook
num_iters = 50
im = np.arange(60).reshape((15,4))
fig, ax = plt.subplots(1,1)
fig.show()
fig.canvas.draw()
iter_times = np.zeros(num_iters)
for i in range(num_iters):
    im = np.roll(a=im, shift=1, axis=0)
    t0 = timer()
    ax.imshow(im.T, vmin=im.min(), vmax=im.max())
    ax.set_title('Iter # {}/{}'.format(i+1, num_iters))
    fig.canvas.draw()
    iter_times[i] = timer()-t0
plt.figure(figsize=(6,3))
plt.plot(np.arange(num_iters)+1, iter_times)
plt.title('Imshow/drawing time per iteration')
plt.xlabel('Iteration number')
plt.ylabel('Time (seconds)')
plt.tight_layout()
plt.show()

I think the problem is that the images are 'building up': each call to imshow() adds another image to the axes, so every previous one is redrawn on every iteration. If you add ax.clear() right before the imshow(), you'll get constant plot times.
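An alternative to clearing the axes, not taken from the answer above, is to create the image artist once and only swap its data inside the loop. The following is a minimal sketch under stated assumptions: it selects the non-interactive Agg backend purely so that it runs headless (in a notebook you would keep your own backend), and it relies on the color limits staying fixed, which holds here because np.roll does not change the minimum or maximum.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

num_iters = 50
im = np.arange(60).reshape((15, 4))

fig, ax = plt.subplots(1, 1)
# Create the image artist once, with fixed color limits.
image = ax.imshow(im.T, vmin=im.min(), vmax=im.max())

for i in range(num_iters):
    im = np.roll(im, shift=1, axis=0)
    image.set_data(im.T)  # reuse the existing artist instead of adding a new one
    ax.set_title('Iter # {}/{}'.format(i + 1, num_iters))
    fig.canvas.draw()

# The axes still hold a single image, so the per-draw work does not grow.
num_images = len(ax.images)
```

Because no new artists are added, the cost of each draw should stay roughly flat across iterations.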

Related

Multiprocessing with Matplotlib Animation (Python)

In short, I am trying to run a live monitoring dashboard while calculating data in the background. Here is the code for my monitoring dashboard:
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.animation import FuncAnimation
import psutil
import collections
# function to update the data
def monitor_machine_metrics(i):
    cpu_metrics.popleft()
    cpu_metrics.append(psutil.cpu_percent(interval=1))
    ram_metrics.popleft()
    ram_metrics.append(psutil.virtual_memory().percent)
    # clear axis
    ax.cla()
    ax1.cla()
    # plot cpu usage
    ax.plot(cpu_metrics)
    ax.text(len(cpu_metrics)-1, cpu_metrics[-1]+2, "{}%".format(cpu_metrics[-1]))
    ax.set_ylim(0,100)
    # plot memory usage
    ax1.plot(ram_metrics)
    ax1.text(len(ram_metrics)-1, ram_metrics[-1]+2, "{}%".format(ram_metrics[-1]))
    ax1.set_ylim(0,100)
    legend = plt.legend(['CPU Usage', 'RAM Usage'], loc='upper left')
    return legend
# start collections with zeros
cpu_metrics = collections.deque(np.zeros(10))
ram_metrics = collections.deque(np.zeros(10))
usage_fig = plt.figure()
ax = plt.subplot()
ax1 = plt.subplot()
def display_dynamic_data(figure, funct, time_interval):
    ani = FuncAnimation(figure, funct, interval=time_interval)
    plt.show()
For the sake of keeping this simple, let's pretend I just want to print hi while the dashboard is running.
import multiprocessing
if __name__ == '__main__':
    machine_monitoring_process = multiprocessing.Process(target=display_dynamic_data, args=(usage_fig, monitor_machine_metrics, 1000))
    machine_monitoring_process.start()
    print('Hi')
Since FuncAnimation runs similar to an infinite loop, I purposely designed for this sub-process to be completely decoupled. While 'hi' prints, the figure comes up but no points will be displayed. However, display_dynamic_data works just fine if we just call the function instead of instantiating it as a sub-process. Unfortunately, I need this to run in parallel, since the real code is doing computations in the background.
Is there something that I am missing here?
Thank you!
When you use multiprocessing.Process, a new process with its own memory and objects is created. Code running in the new process has no effect in the original one. If you want to interchange information, you should use queues or other type of information exchange mechanism.
In your example, display_dynamic_data runs in the new process and exits. Remember that you must keep a reference to the FuncAnimation object to prevent it from being garbage collected.
You can check this answer to grasp what happens when you use multiprocessing.
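As a rough illustration of the queue-based exchange mentioned above, here is a minimal sketch in which a worker process computes values and the parent process collects them. The names compute_samples and collect are hypothetical, and the squared numbers stand in for whatever the real background computation produces; in the dashboard case, the parent process would own the figure and update it from the values it receives.

```python
import multiprocessing as mp


def compute_samples(out_queue, n_samples):
    """Hypothetical worker: push n_samples results, then a sentinel."""
    for i in range(n_samples):
        out_queue.put(i * i)
    out_queue.put(None)  # sentinel that tells the consumer to stop


def collect(in_queue):
    """Read from the queue until the sentinel arrives."""
    values = []
    while True:
        item = in_queue.get()
        if item is None:
            break
        values.append(item)
    return values


if __name__ == "__main__":
    q = mp.Queue()
    worker = mp.Process(target=compute_samples, args=(q, 5))
    worker.start()
    print(collect(q))  # values computed in the child process
    worker.join()
```

The sentinel is one simple way to signal the end of the stream; a long-running monitor would more likely poll the queue from the animation callback.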

How to cache a plot in streamlit?

I have built a dashboard in streamlit where you can select a client_ID and have SHAP plots displayed (Waterfall and Force plot) to interpret the prediction of credit default for this client.
I also want to display a SHAP summary plot for the whole train dataset. The latter does not change every time you make a new prediction, and it takes a long time to plot, so I want to cache it. I guess the best approach would be to use st.cache, but I have not been able to make it work.
Here below is the code I have unsuccessfully tried in main.py:
I first define the function whose output (fig) I want to cache, then I pass that output to st.pyplot. It works without the st.cache decorator, but as soon as I add it and rerun the app, the function summary_plot_all runs indefinitely.
IN:
@st.cache
def summary_plot_all():
    fig, axes = plt.subplots(nrows=1, ncols=1)
    shap.summary_plot(shapvs[1], prep_train.iloc[:, :-1].values,
                      prep_train.columns, max_display=50)
    return fig
st.pyplot(summary_plot_all())
OUT (displayed in streamlit app)
Running summary_plot_all().
Does anyone know what's wrong, or a better way of caching a plot in streamlit?
version of packages:
streamlit==0.84.1,
matplotlib==3.4.2,
shap==0.39.0
Try
import matplotlib
@st.cache(hash_funcs={matplotlib.figure.Figure: lambda _: None})
def summary_plot_all():
    fig, axes = plt.subplots(nrows=1, ncols=1)
    shap.summary_plot(shapvs[1], prep_train.iloc[:, :-1].values,
                      prep_train.columns, max_display=50)
    return fig
Check this streamlit github issue

Using multiple sliders to manipulate curves in a single graph

I created the following Jupyter Notebook, in which three functions are shifted using three sliders. In the future I would like to generalise it to an arbitrary number of curves (i.e. n curves). However, right now the graph updating procedure is very slow, and the graph itself doesn't seem to stay fixed in the corresponding cell. I didn't receive any error message, but I'm pretty sure that there is a mistake in the update function.
Here is the code:
from ipywidgets import interact
import ipywidgets as widgets
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from IPython.display import display
x = np.linspace(0, 2*np.pi, 2000)
y1=np.exp(0.3*x)*np.sin(5*x)
y2=5*np.exp(-x**2)*np.sin(20*x)
y3=np.sin(2*x)
m=[y1,y2,y3]
num_curve=3
def shift(v_X):
    v_T=v_X
    vector=np.transpose(m)
    print(' ')
    print(v_T)
    print(' ')
    curve=vector+v_T
    return curve
controls=[]
o='vertical'
for i in range(num_curve):
    title="x%i" % (i%num_curve+1)
    sl=widgets.FloatSlider(description=title,min=-2.0, max=2.0, step=0.1,orientation=o)
    controls.append(sl)
Dict = {}
for c in controls:
    Dict[c.description] = c
uif = widgets.HBox(tuple(controls))
def update_N(**xvalor):
    xvalor=[]
    for i in range(num_curve):
        xvalor.append(controls[i].value)
    curve=shift(xvalor)
    new_curve=pd.DataFrame(curve)
    new_curve.plot()
    plt.show()
outf = widgets.interactive_output(update_N,Dict)
display(uif, outf)
Your function runs on every single value the slider moves through, which is probably what makes the updates so slow. You can change this by adding continuous_update=False to your FloatSlider call:
sl=widgets.FloatSlider(description=title,
                       min=-2.0,
                       max=2.0,
                       step=0.1,
                       orientation=o,
                       continuous_update=False)
This got me much better performance, and the chart doesn't flicker as much as there are vastly fewer redraws. Does this help?

Histogram in Bokeh charts takes a looong time

I am trying to move from matplotlib to bokeh. However, I am finding some annoying features. The latest one I encountered is that it took several minutes to make a histogram of about 1.5M entries, which would have taken a fraction of a second with Matplotlib. Is that normal? And if so, what's the reason?
from bokeh.charts import Histogram, output_file, show
import pandas as pd
output_notebook()
jd1 = pd.read_csv("somefile.csv")
p = Histogram(jd1['QTY'], bins=50)
show(p)
I'm not sure offhand what might be going on with Histogram in your case. Without the data file it's impossible to try to reproduce or debug. But in any case bokeh.charts does not really have a maintainer at the moment, so I would actually just recommend using bokeh.plotting to create your histogram. The bokeh.plotting API is stable (for several years now) and extensively documented. It's a few more lines of code but not many:
import numpy as np
from bokeh.plotting import figure, show, output_notebook
output_notebook()
# synthesize example data
measured = np.random.normal(0, 0.5, 10000000)
hist, edges = np.histogram(measured, density=True, bins=50)
p = figure(title="Normal Distribution (μ=0, σ=0.5)")
p.quad(top=hist, bottom=0, left=edges[:-1], right=edges[1:], line_color=None)
show(p)
As you can see, that takes (on my laptop) ~half a second for a 10 million point histogram, including generating synthetic data and binning it.
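To make the slicing in the quad call above a little more concrete, here is a small NumPy-only sketch of what np.histogram returns. The sample size and the seeded generator are assumptions chosen for illustration, not part of the original answer.

```python
import numpy as np

# Assumption: a seeded generator and a smaller sample, just for illustration.
rng = np.random.default_rng(0)
measured = rng.normal(0, 0.5, 100_000)

hist, edges = np.histogram(measured, density=True, bins=50)

# 50 bins are delimited by 51 edges, so each bar spans one consecutive pair.
lefts, rights = edges[:-1], edges[1:]
widths = rights - lefts

# With density=True the bar areas sum to 1.
total_area = float(np.sum(hist * widths))
```

Only these 50 bars are handed to the plotting layer, regardless of how many raw points were binned, which is why the plot itself stays cheap.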

FuncAnimation does not refresh

I am having problems with FuncAnimation, I am using a slightly modified example http://matplotlib.org/examples/animation/basic_example.html
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation
def update_line(num, data, line):
    data.pop(0)
    data.append(np.random.random())
    line.set_ydata(data)
    return line,
fig1 = plt.figure()
data = [0.0 for i in range(100)]
l, = plt.plot(data, 'r-')
plt.ylim(-1, 1)
line_ani = animation.FuncAnimation(fig1, update_line, 25, fargs=(data, l), interval=50, blit=True)
plt.show()
The problem is that the first line (updated by update_line) stays in the background.
If I resize the window (by clicking on the corner of the window and moving the mouse), this first line disappears, but now the first line drawn after the resize stays in the background.
Is this normal, or am I doing something wrong?
Thanks in advance.
If you are not overly worried about speed, remove blit=True and it should work.
Blitting is a way to save time by only re-drawing the bits of the figure that have changed (instead of everything), but it is easy to get wrong. If you do not include blit=True, all the artists get re-drawn every time.
I recommend reading python matplotlib blit to axes or sides of the figure? and Animating Matplotlib panel - blit leaves old frames behind.
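For reference, the non-blitting variant suggested above might look like the sketch below. Two things here are assumptions rather than part of the original code: the Agg backend is selected only so the sketch runs headless, and a collections.deque with maxlen replaces the manual pop(0) on a list.

```python
import collections

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.animation as animation
import matplotlib.pyplot as plt
import numpy as np


def update_line(num, data, line):
    # Appending to a full deque(maxlen=...) drops the oldest sample.
    data.append(np.random.random())
    line.set_ydata(list(data))
    return line,


fig1 = plt.figure()
data = collections.deque([0.0] * 100, maxlen=100)
l, = plt.plot(list(data), 'r-')
plt.ylim(-1, 1)

# No blit=True: every artist is redrawn on each frame, so no stale line remains.
line_ani = animation.FuncAnimation(fig1, update_line, 25,
                                   fargs=(data, l), interval=50)
# In an interactive session, call plt.show() here to start the animation.
```

Keeping the line_ani reference alive matters, since the animation stops if the object is garbage collected.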