I have built a dashboard in streamlit where you can select a client_ID and have SHAP plots displayed (Waterfall and Force plot) to interpret the prediction of credit default for this client.
I also want to display a SHAP summary plot for the whole train dataset. The latter does not change every time you make a new prediction and takes a long time to plot, so I want to cache it. I guess the best approach would be to use st.cache, but I have not been able to make it work.
Here below is the code I have unsuccessfully tried in main.py:
I first define the function whose output (fig) I want to cache, then pass its result to st.pyplot. It works without the st.cache decorator, but as soon as I add it and rerun the app, the function summary_plot_all runs indefinitely.
IN:
@st.cache
def summary_plot_all():
    fig, axes = plt.subplots(nrows=1, ncols=1)
    shap.summary_plot(shapvs[1], prep_train.iloc[:, :-1].values,
                      prep_train.columns, max_display=50)
    return fig
st.pyplot(summary_plot_all())
OUT (displayed in streamlit app)
Running summary_plot_all().
Does anyone know what's wrong, or a better way of caching a plot in Streamlit?
version of packages:
streamlit==0.84.1,
matplotlib==3.4.2,
shap==0.39.0
Try
import matplotlib
@st.cache(hash_funcs={matplotlib.figure.Figure: lambda _: None})
def summary_plot_all():
    fig, axes = plt.subplots(nrows=1, ncols=1)
    shap.summary_plot(shapvs[1], prep_train.iloc[:, :-1].values,
                      prep_train.columns, max_display=50)
    return fig
Check this streamlit github issue
Related
I have a matplotlib chart working nicely as a Python script. I need to create this chart style in Flask, but I can't use this method there because Flask's thread management doesn't play well with matplotlib.
Oddly, the current method runs successfully once, but subsequent runs produce this error:
RuntimeError: main thread is not in main loop
So this is my desired chart format to produce in flask.
the code I'm using currently.
fig, ax = plt.subplots()
fig.subplots_adjust(bottom=0.29, top=.91)
ax.set_title(title)
ax.set_ylabel("y label text")
ax.set_xlabel('x label text')
ax.tick_params(axis='x', labelrotation = -80)
l = ax.plot(df_output['column1'])
y_error = df_output['column2']
plt.errorbar(list(df_output.index), \
list(df_output['column1']), \
yerr = y_error,fmt='o',ecolor = 'blue',color='blue')
fig.legend(l, loc=8, labels=labels)
#loc=2 = top left corner, loc=8 = 'lower center'
#plt.show()
plt.savefig(output_path+"/"+title+'_errorbars.png')
I found this example that works with flask
https://gist.github.com/illume/1f19a2cf9f26425b1761b63d9506331f
It uses the matplotlib charting syntax below. I need to convert my old matplotlib format to suit the Flask-compatible format. Is this chart format possible via FigureCanvasAgg?
fig = Figure()
axis = fig.add_subplot(1, 1, 1)
print("type(axis):", type(axis))
x_points = data.iloc[:, 0]
y_points = data['mean minus sterility control mean']
axis.plot(x_points, y_points)
output = io.BytesIO()
FigureCanvasAgg(fig).print_png(output)
return Response(output.getvalue(), mimetype="image/png")
I'll admit to not being strong at building matplotlib charts. Changing between chart building methods throws me.
I'm digging around the docs at the moment.
https://matplotlib.org/stable/gallery/user_interfaces/canvasagg.html
I did find this Q&A (RuntimeError: main thread is not in main loop with Matplotlib and Flask)
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
While this appears to run for me, I want to move away from creating charts as files on the server: there is too much potential for file mismanagement. Creating the chart as an io.BytesIO() output (or some format within the Flask HTTP response to the user) is a much better solution.
(I'd like to keep at an image output, rather than change architecture to (say) a json output and constructing chart in client using javascript libraries)
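A minimal sketch of how the pyplot chart above might be rebuilt on a plain `Figure`, so that nothing touches pyplot's global state and the PNG ends up in memory. The function name `render_errorbar_png` and the sample values are made up for illustration; `x`, `y` and `yerr` stand in for `df_output.index`, `df_output['column1']` and `df_output['column2']`.

```python
import io

from matplotlib.figure import Figure


def render_errorbar_png(x, y, yerr, title, label):
    """Return the error-bar chart as PNG bytes without using pyplot."""
    fig = Figure()
    ax = fig.add_subplot(1, 1, 1)
    fig.subplots_adjust(bottom=0.29, top=0.91)
    ax.set_title(title)
    ax.set_ylabel("y label text")
    ax.set_xlabel("x label text")
    ax.tick_params(axis="x", labelrotation=-80)

    # ax.errorbar replaces plt.errorbar; it draws on this Axes only.
    (line,) = ax.plot(x, y)
    ax.errorbar(x, y, yerr=yerr, fmt="o", ecolor="blue", color="blue")
    fig.legend([line], [label], loc="lower center")

    # Write the PNG into memory instead of a file on the server.
    buf = io.BytesIO()
    fig.savefig(buf, format="png")
    return buf.getvalue()


# Illustrative sample data, not taken from the question.
png_bytes = render_errorbar_png(
    x=[0, 1, 2, 3],
    y=[1.0, 1.5, 1.2, 1.8],
    yerr=[0.1, 0.2, 0.15, 0.1],
    title="example",
    label="column1",
)
```

Inside a Flask view the bytes could then be returned the same way as in the gist's snippet, with `Response(png_bytes, mimetype="image/png")`.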
In short, I am trying to run a live monitoring dashboard while calculating data in the background. Here is the code for my monitoring dashboard:
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.animation import FuncAnimation
import psutil
import collections
import multiprocessing
# function to update the data
def monitor_machine_metrics(i):
    cpu_metrics.popleft()
    cpu_metrics.append(psutil.cpu_percent(interval=1))
    ram_metrics.popleft()
    ram_metrics.append(psutil.virtual_memory().percent)
    # clear axis
    ax.cla()
    ax1.cla()
    # plot cpu usage
    ax.plot(cpu_metrics)
    ax.text(len(cpu_metrics)-1, cpu_metrics[-1]+2, "{}%".format(cpu_metrics[-1]))
    ax.set_ylim(0,100)
    # plot memory usage
    ax1.plot(ram_metrics)
    ax1.text(len(ram_metrics)-1, ram_metrics[-1]+2, "{}%".format(ram_metrics[-1]))
    ax1.set_ylim(0,100)
    legend = plt.legend(['CPU Usage', 'RAM Usage'], loc='upper left')
    return legend
# start collections with zeros
cpu_metrics = collections.deque(np.zeros(10))
ram_metrics = collections.deque(np.zeros(10))
usage_fig = plt.figure()
ax = plt.subplot()
ax1 = plt.subplot()
def display_dynamic_data(figure, funct, time_interval):
    ani = FuncAnimation(figure, funct, interval=time_interval)
    plt.show()
For the sake of keeping this simple, let's pretend I just want to print hi while the dashboard is running.
if __name__ == '__main__':
    machine_monitoring_process = multiprocessing.Process(target=display_dynamic_data, args=(usage_fig, monitor_machine_metrics, 1000))
    machine_monitoring_process.start()
    print('Hi')
Since FuncAnimation runs similar to an infinite loop, I purposely designed for this sub-process to be completely decoupled. While 'hi' prints, the figure comes up but no points will be displayed. However, display_dynamic_data works just fine if we just call the function instead of instantiating it as a sub-process. Unfortunately, I need this to run in parallel, since the real code is doing computations in the background.
Is there something that I am missing here?
Thank you!
When you use multiprocessing.Process, a new process with its own memory and objects is created. Code running in the new process has no effect on the original one. If you want to exchange information between them, you should use queues or another information exchange mechanism.
In your example, display_dynamic_data runs in the new process and exits. Remember that you must keep a reference to the FuncAnimation object to prevent it from being garbage collected.
You can check this answer to grasp what happens when you use multiprocessing.
I have plotted a histogram and would like to modify it, then re-plot it. It won't plot again without redefining the Figure and Axes object definitions. I'm using Jupyter Notebook, and I'm new to matplotlib, so I don't know if this is something that I'm not understanding about matplotlib, if it's an issue with the Jupyter Notebook or something else.
Here's my 1st block of code:
"""Here's some data."""
some_data = np.random.randn(150)
"""Here I define my `Figure` and `Axes` objects."""
fig, ax = plt.subplots()
"""Then I make a histogram from them, and it shows up just fine."""
ax.hist(some_data, range=(0, 5))
plt.show()
Here's the output from my 1st block of code:
Here's my 2nd block of code:
"""Here I modify the parameter `bins`."""
ax.hist(some_data, bins=20, range=(0, 5))
"""When I try to make a new histogram, it doesn't work."""
plt.show()
My 2nd block of code generates no visible output, which is the problem.
Here's my 3rd and final block of code:
"""But it does work if I define new `Figure` and `Axes` objects.
Why is this?
How can I display new, modified plots without defining new `Figure` and/or `Axes` objects? """
new_fig, new_ax = plt.subplots()
new_ax.hist(some_data, bins=20, range=(0, 5))
plt.show()
Here's the output from my 3rd and final block of code:
Thanks in advance.
When you generate a figure or an axis, it remains accessible for rendering or display until it's used for rendering or display. Once you execute plt.show() in your first block, the ax becomes unavailable. Your 3rd block of code is showing a plot because you're regenerating the figure and axes.
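A sketch of one way to keep using the same `Figure` and `Axes`, assuming the inline backend: clear the axes, draw the new histogram, then show the figure object itself. In a notebook, ending the cell with `fig` (or calling `display(fig)`) should render it again. `redraw_hist` is a hypothetical helper name, not part of matplotlib.

```python
import matplotlib.pyplot as plt
import numpy as np

some_data = np.random.randn(150)
fig, ax = plt.subplots()
ax.hist(some_data, range=(0, 5))


def redraw_hist(ax, data, **hist_kwargs):
    """Clear the axes and draw a fresh histogram on them."""
    ax.clear()
    ax.hist(data, **hist_kwargs)
    return ax.figure


# In a later cell: redraw on the same axes instead of creating new ones.
fig = redraw_hist(ax, some_data, bins=20, range=(0, 5))
fig  # as the last line of a notebook cell, this displays the updated figure
```

Without the `ax.clear()` call, the second histogram would be drawn on top of the first one rather than replacing it.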
I can't seem to find a simple and fast way of plotting image sequences with plain matplotlib in a Jupyter Notebook. I've tried FuncAnimation, fig.canvas.draw(), blitting, as well as just the standard imshow-pause combo; without success or with very slow refresh rate. I don't need the images to be interactive - they just need to be shown sequentially and can't pop up a new figure window for each image. I've seen many solutions here, with none seeming to work the way I want.
My general pipeline does significant processing, with each image generated and plotted within a while or for loop. FuncAnimation is not desirable since it requires passing a function handle and my use case involves many arguments and state variables that make it difficult to use.
The best I've got is the working example below using fig.canvas.draw() - showing that drawing time increases linearly per iteration, where I need it to remain constant!
import numpy as np
import matplotlib.pyplot as plt
from timeit import default_timer as timer
%matplotlib notebook
num_iters = 50
im = np.arange(60).reshape((15,4))
fig, ax = plt.subplots(1,1)
fig.show()
fig.canvas.draw()
iter_times = np.zeros(num_iters)
for i in range(num_iters):
    im = np.roll( a=im, shift=1, axis=0 )
    t0 = timer()
    ax.imshow(im.T, vmin=im.min(), vmax=im.max())
    ax.set_title('Iter # {}/{}'.format(i+1, num_iters))
    fig.canvas.draw()
    iter_times[i] = timer()-t0
plt.figure(figsize=(6,3))
plt.plot(np.arange(num_iters)+1, iter_times)
plt.title('Imshow/drawing time per iteration')
plt.xlabel('Iteration number')
plt.ylabel('Time (seconds)')
plt.tight_layout()
plt.show()
I think the problem is that the images are 'building up' on the axes: each imshow() call adds another image, so every previous one is redrawn on every iteration. If you add ax.clear() right before the imshow(), the drawing time should stay roughly constant instead of growing.
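As an alternative to clearing the axes, a common pattern is to create the image artist once and only replace its pixel data inside the loop. This is a sketch of that different technique, not what the answer above proposes; it reuses the names from the question and assumes the colour range can be fixed before the loop.

```python
import matplotlib.pyplot as plt
import numpy as np

num_iters = 50
im = np.arange(60).reshape((15, 4))

fig, ax = plt.subplots(1, 1)
# Create the image artist once; vmin/vmax fix the colour scale up front.
img = ax.imshow(im.T, vmin=im.min(), vmax=im.max())

for i in range(num_iters):
    im = np.roll(a=im, shift=1, axis=0)
    img.set_data(im.T)  # swap the pixels of the existing artist
    ax.set_title('Iter # {}/{}'.format(i + 1, num_iters))
    fig.canvas.draw()
```

Because the axes only ever hold one image, the amount of work per draw does not grow with the iteration count.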
I need to create a figure in a file without displaying it within IPython notebook. I am not clear on the interaction between IPython and matplotlib.pylab in this regard. But when I call pylab.savefig("test.png"), the current figure gets displayed in addition to being saved in test.png. When automating the creation of a large set of plot files, this is often undesirable. The same applies when an intermediate file for external processing by another app is desired.
Not sure if this is a matplotlib or IPython notebook question.
This is a matplotlib question, and you can get around this by using a backend that doesn't display to the user, e.g. 'Agg':
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
plt.plot([1,2,3])
plt.savefig('/tmp/test.png')
EDIT: If you don't want to lose the ability to display plots, turn off Interactive Mode, and only call plt.show() when you are ready to display the plots:
import matplotlib.pyplot as plt
# Turn interactive plotting off
plt.ioff()
# Create a new figure, plot into it, then close it so it never gets displayed
fig = plt.figure()
plt.plot([1,2,3])
plt.savefig('/tmp/test0.png')
plt.close(fig)
# Create a new figure, plot into it, then don't close it so it does get displayed
plt.figure()
plt.plot([1,3,2])
plt.savefig('/tmp/test1.png')
# Display all "open" (non-closed) figures
plt.show()
We don't need plt.ioff() or plt.show() if we use %matplotlib inline. You can test the code above without plt.ioff(); it is plt.close() that plays the essential role. Try this:
%matplotlib inline
import pylab as plt
# It doesn't matter whether you add the line below. You can even replace it with 'plt.ion()' and you will see no change.
## plt.ioff()
# Create a new figure, plot into it, then close it so it never gets displayed
fig = plt.figure()
plt.plot([1,2,3])
plt.savefig('test0.png')
plt.close(fig)
# Create a new figure, plot into it, then don't close it so it does get displayed
fig2 = plt.figure()
plt.plot([1,3,2])
plt.savefig('test1.png')
If you run this code in IPython, it will display the second plot, and if you add plt.close(fig2) to the end of it, you will see nothing.
In conclusion, if you close a figure with plt.close(fig), it won't be displayed.
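For batch jobs, another option is to skip pyplot entirely and build each `Figure` directly, so it is never registered as an open pyplot figure and there is nothing to close. This is a sketch under that assumption; `save_plot`, the temporary directory and the sample values are illustrative only.

```python
import os
import tempfile

from matplotlib.figure import Figure


def save_plot(values, path):
    """Plot values and write the figure to path without going through pyplot."""
    fig = Figure()
    ax = fig.add_subplot(1, 1, 1)
    ax.plot(values)
    fig.savefig(path)


# Write a small set of plot files into a temporary directory.
out_dir = tempfile.mkdtemp()
paths = []
for i, values in enumerate([[1, 2, 3], [1, 3, 2]]):
    path = os.path.join(out_dir, "test{}.png".format(i))
    save_plot(values, path)
    paths.append(path)
```

Since these figures are not tracked by pyplot, they also do not accumulate in memory the way unclosed `plt.figure()` objects do during a long loop.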