Update data point labels in bokeh plot - matplotlib

I use Bokeh in an IPython notebook and would like to have a button next to a plot to switch the labels of the data points on or off. I found a solution using IPython.html.widgets.interact, but it resets the plot on each update, including zoom and pan.
This is the minimal working code example:
from numpy.random import random
from bokeh.plotting import figure, show, output_notebook
from IPython.html.widgets import interact

def plot(label_flag):
    p = figure()
    N = 10
    x = random(N)+2
    y = random(N)+2
    labels = range(N)
    p.scatter(x, y)
    if label_flag:
        p.text(x, y, labels)
    output_notebook()
    show(p)

interact(plot, label_flag=True)
P.S. If there is an easy way to do this in matplotlib, I would also switch back.

By using bokeh.models.ColumnDataSource to store and change the plot's data, I was able to achieve what I wanted.
One caveat: I found no way to make it work without a refresh unless I call output_notebook twice, in two different cells. If I remove either of the two output_notebook calls, the GUI of the tools button looks broken, or changing a setting again resets the plot.
from numpy.random import random
from bokeh.plotting import figure, show, output_notebook
from IPython.html.widgets import interact
from bokeh.models import ColumnDataSource
output_notebook()
## <-- new cell -->
p = figure()
N = 10
x_data = random(N)+2
y_data = random(N)+2
# use a real list so the column also works on Python 3, where range() is lazy
labels = list(range(N))
source = ColumnDataSource(
    data={
        'x':x_data,
        'y':y_data,
        'desc':labels
    }
)
p.scatter('x', 'y', source=source)
p.text('x', 'y', 'desc', source=source)
output_notebook()

def update_plot(label_flag=True):
    # swap only the label column; the figure and its glyphs are reused
    if label_flag:
        source.data['desc'] = list(range(N))
    else:
        source.data['desc'] = ['']*N
    show(p)

interact(update_plot, label_flag=True)
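The toggle above works because both branches replace the 'desc' column with a list of the same length N, so the column stays consistent with 'x' and 'y'. A minimal sketch of that logic as a standalone helper follows; the name label_column is hypothetical and not part of Bokeh, and the labels are converted to strings here as an assumption about what a text glyph displays best.

```python
def label_column(n, show_labels):
    """Return the 'desc' column: index labels when shown, empty strings when hidden."""
    if show_labels:
        return [str(i) for i in range(n)]
    # same length as the data columns, so the data source stays consistent
    return [""] * n

# Hypothetical usage with the `source` defined in the answer:
#     source.data['desc'] = label_column(N, label_flag)
print(label_column(3, True))
print(label_column(3, False))
```

Keeping the column length fixed and blanking the strings avoids adding or removing the text glyph itself.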

Related

Equivalent of Hist()'s Layout hyperparameter in Sns.Pairplot?

I am trying to find an equivalent of hist()'s figsize and layout parameters for sns.pairplot().
I have a pairplot that gives me nice scatterplots between the X's and y. However, all plots are laid out in a single horizontal row, and to my knowledge there is no equivalent layout parameter to wrap them into several rows. 4 plots per row would be great.
This is my current sns.pairplot():
sns.pairplot(X_train,
             x_vars = X_train.select_dtypes(exclude=['object']).columns,
             y_vars = ["SalePrice"])
This is what I would like it to look like: Source
num_mask = train_df.dtypes != object
num_cols = train_df.loc[:, num_mask[num_mask == True].keys()]
num_cols.hist(figsize = (30,15), layout = (4,10))
plt.show()
What you want to achieve isn't currently supported by sns.pairplot, but you can use one of the other figure-level functions (sns.displot, sns.catplot, ...). sns.lmplot creates a grid of scatter plots. For this to work, the dataframe needs to be in "long form".
Here is a simple example. sns.lmplot has parameters to leave out the regression line (fit_reg=False), to set the height of the individual subplots (height=...), to set its aspect ratio (aspect=..., where the subplot width will be height times aspect ratio), and many more. If all y ranges are similar, you can use the default sharey=True.
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
# create some test data with different y-ranges
np.random.seed(20230209)
X_train = pd.DataFrame({"".join(np.random.choice([*'uvwxyz'], np.random.randint(3, 8))):
                            np.random.randn(100).cumsum() + np.random.randint(100, 1000) for _ in range(10)})
X_train['SalePrice'] = np.random.randint(10000, 100000, 100)
# convert the dataframe to long form
# 'SalePrice' will get excluded automatically via `melt`
compare_columns = X_train.select_dtypes(exclude=['object']).columns
long_df = X_train.melt(id_vars='SalePrice', value_vars=compare_columns)
# create a grid of scatter plots
g = sns.lmplot(data=long_df, x='SalePrice', y='value', col='variable', col_wrap=4, sharey=False)
g.set(ylabel='')
plt.show()
Here is another example, with histograms of the mpg dataset:
import matplotlib.pyplot as plt
import seaborn as sns
mpg = sns.load_dataset('mpg')
compare_columns = mpg.select_dtypes(exclude=['object']).columns
mpg_long = mpg.melt(value_vars=compare_columns)
g = sns.displot(data=mpg_long, kde=True, x='value', common_bins=False, col='variable', col_wrap=4, color='crimson',
                facet_kws={'sharex': False, 'sharey': False})
g.set(xlabel='')
plt.show()
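To make "long form" concrete, here is a small pure-Python sketch of the reshaping that melt performs in the examples above. The function to_long_form and the toy column names are hypothetical and only mimic the shape of the result; in practice you would call DataFrame.melt as shown.

```python
def to_long_form(columns, id_var):
    """Mimic the shape of DataFrame.melt for a dict of equal-length columns."""
    ids = columns[id_var]
    rows = []
    for name, values in columns.items():
        if name == id_var:
            continue  # the id column is repeated on every row, not melted
        for id_value, value in zip(ids, values):
            rows.append({id_var: id_value, "variable": name, "value": value})
    return rows

# toy "wide" data: one column per feature plus the id column
wide = {"SalePrice": [100, 200], "area": [10, 20], "rooms": [3, 4]}
long_rows = to_long_form(wide, "SalePrice")
for row in long_rows:
    print(row)
```

Each original column becomes a block of rows tagged with its name in 'variable', which is the column that col='variable' uses to split the data into one subplot per feature.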

pick_event in Jupyter with matplotlib scatter plot

I really like how simply ipywidgets.interactive works with a pandas DataFrame, but I am having trouble getting data when a point in a scatter plot is selected.
I have looked at some examples that use matplotlib.widgets etc. but none that use it with interactive in Jupyter. It looks like this technique would be described here but it comes up just short:
http://minrk-ipywidgets.readthedocs.io/en/latest/examples/Using%20Interact.html
Here is an ipynb of what I am trying to accomplish:
from ipywidgets import interactive
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.widgets import Button
from matplotlib.text import Annotation
from io import StringIO

data_ssv = """tone_amp_0 tone_freq_0 SNR
75.303 628.0 68.374
84.902 8000.0 61.292
92.856 288.0 70.545
70.000 2093.0 35.036
76.511 6834.0 66.952 """
data = pd.read_table(StringIO(data_ssv), sep=r"\s+", header=0)
col_names = list(data.columns.values)
plottable_col = ['tone_amp_0', 'tone_freq_0', 'SNR']

def annotate(axis, text, x, y):
    text_annotation = Annotation(text, xy=(x, y), xycoords='data')
    axis.add_artist(text_annotation)

def onpick(event):
    ind = event.ind
    label_pos_x = event.mouseevent.xdata
    label_pos_y = event.mouseevent.ydata
    offset = 0  # just in case two dots are very close, this offset keeps the labels from overlapping
    for i in ind:  # if the dots are too close to one another, matplotlib returns a list of clicked dots
        label = "gen_labels"  # generated_labels[i]
        print("index", i, label)  # log it for debugging purposes
        ax = plt.gca()
        annotate(ax, label, label_pos_x + offset, label_pos_y + offset)
        ax.figure.canvas.draw_idle()
        offset += 0.01  # alter the offset in case more than one dot is affected by the click

def update_plot(X='tone_amp_0', Y='tone_freq_0', Z='SNR'):
    plt.scatter(data.loc[:, [X]], data.loc[:, [Y]], marker='.', edgecolors='none', c=data.loc[:, [Z]], picker=True, cmap='RdYlGn')
    plt.title(X+' vs '+Y); plt.xlabel(X); plt.ylabel(Y); plt.colorbar().set_label(Z, labelpad=+1)
    plt.grid(); plt.show()
    plt.gcf().canvas.mpl_connect('pick_event', onpick)

interactive(update_plot, X=plottable_col, Y=plottable_col, Z=plottable_col)
When I select a data point nothing is happening. Not sure how to debug this or understand what I am doing wrong. Can someone point out what I am doing wrong here?
Try putting a semicolon at the end of plt.gcf().canvas.mpl_connect('pick_event', onpick).

Interactively annotating points in scatter plot using Bokeh

I'm trying to use Bokeh to build an interactive tool that allows a user to select a subset of points from a scatter plot and to subsequently label or annotate those points. Ideally, the user-provided input would update a "label" field for that sample's row in a dataframe.
The code below allows the user to select the points, but how do I make it so that they can then label those selected points from a text-input widget, e.g. text = TextInput(value="default", title="Label:"), and in so doing change the "label" field for that sample in the dataframe?
import pandas as pd
import numpy as np
from bokeh.plotting import figure, output_file, show, ColumnDataSource
from bokeh.models import HoverTool
from bokeh.models.widgets import TextInput

data = pd.DataFrame()
data["x"] = np.random.randn(100)
data["y"] = np.random.randn(100)
data["label"] = "other"
x=data.x.values
y=data.y.values
label=data.label.values
output_file("toolbar.html")
source = ColumnDataSource(
    data=dict(
        x=x,
        y=y,
        _class=label,
    )
)
hover = HoverTool(
    tooltips=[
        ("index", "$index"),
        ("(x,y)", "($x, $y)"),
        ("class", "@_class"),  # '@' refers to a column of the data source
    ]
)
p = figure(plot_width=400, plot_height=400, tools=[hover,"lasso_select","crosshair",],
           title="Mouse over the dots")
p.circle('x', 'y', size=5, source=source)
show(p)

automatically changing view bounds on resize matplotlib

Is it possible to set up a plot to show more data when expanded?
Matplotlib plots scale when resized. To show a specific area, one can use set_xlim and the like on the axes. I have an ECG-like plot showing real-time data; its y limits are predefined, but I want to see more data along x if I expand the window or just have a big monitor.
I'm using it in a PySide app, and I could just change xlim on resize, but I want a cleaner and more generic solution.
One way to do this is to implement a handler for resize_event. Here is a short example of how this might be done. You can modify it for your needs:
import numpy as np
import matplotlib.pyplot as plt

def onresize(event):
    width = event.width
    scale_factor = 100.
    data_range = width/scale_factor
    start, end = plt.xlim()
    new_end = start+data_range
    plt.xlim((start, new_end))

if __name__ == "__main__":
    fig = plt.figure()
    ax = fig.add_subplot(111)
    t = np.arange(100)
    y = np.random.rand(100)
    ax.plot(t,y)
    plt.xlim((0, 10))
    cid = fig.canvas.mpl_connect('resize_event', onresize)
    plt.show()
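The arithmetic inside the handler above can be isolated into a small pure function, which makes the scaling rule easier to test and reuse. This is only a sketch: the name xlim_for_width and the parameter px_per_unit are hypothetical, and px_per_unit plays the role of scale_factor in the answer (pixels of canvas width per data unit).

```python
def xlim_for_width(start, width_px, px_per_unit=100.0):
    """Return (start, end) so that each data unit occupies px_per_unit pixels."""
    return (start, start + width_px / px_per_unit)

# a 640-pixel-wide canvas shows 6.4 data units starting at 0
print(xlim_for_width(0, 640))
# a wider canvas shows proportionally more data from the same start
print(xlim_for_width(5, 1000))
```

Inside onresize this would be used as plt.xlim(xlim_for_width(start, event.width)), keeping the left edge fixed while the right edge follows the window width.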

Matplotlib plot in Tkinter - every update adds new NavigationToolbar?

I am working on a Tkinter GUI to interactively generate Matplotlib plots, depending on user input. To this end, it needs to re-plot after the user changes the input.
I have gotten it to work in principle, but would like to include the NavigationToolbar. However, I cannot seem to get the updating of the NavigationToolbar to work correctly.
Here is a basic working version of the code (without the user input Entries):
# Import modules
from Tkinter import *
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.backends.backend_tkagg import FigureCanvasTkAgg, NavigationToolbar2TkAgg

# global variable: do we already have a plot displayed?
show_plot = False

# plotting function
def plot(x, y):
    fig = plt.figure()
    ax1 = fig.add_subplot(1,1,1)
    ax1.plot(x,y)
    return fig

def main():
    x = np.arange(0.0,3.0,0.01)
    y = np.sin(2*np.pi*x)
    fig = plot(x, y)
    canvas = FigureCanvasTkAgg(fig, master=root)
    toolbar = NavigationToolbar2TkAgg(canvas, toolbar_frame)
    global show_plot
    if show_plot:
        #remove existing plot and toolbar widgets
        canvas.get_tk_widget().grid_forget()
        toolbar_frame.grid_forget()
    toolbar_frame.grid(row=1,column=1)
    canvas.get_tk_widget().grid(row=0,column=1)
    show_plot=True

# GUI
root = Tk()
draw_button = Button(root, text="Plot!", command = main)
draw_button.grid(row=0, column=0)
fig = plt.figure()
canvas = FigureCanvasTkAgg(fig, master=root)
canvas.get_tk_widget().grid(row=0,column=1)
toolbar_frame = Frame(root)
toolbar_frame.grid(row=1,column=1)
root.mainloop()
Pressing "Plot!" once generates the plot and the NavigationToolbar.
Pressing it a second time replots, but generates a second NavigationToolbar (and another every time "Plot!" is pressed), which suggests that grid_forget() is not working.
However, when I change
if show_plot:
    #remove existing plot and toolbar widgets
    canvas.get_tk_widget().grid_forget()
    toolbar_frame.grid_forget()
toolbar_frame.grid(row=1,column=1)
canvas.get_tk_widget().grid(row=0,column=1)
show_plot=True
to
if show_plot:
    #remove existing plot and toolbar widgets
    canvas.get_tk_widget().grid_forget()
    toolbar_frame.grid_forget()
else:
    toolbar_frame.grid(row=1,column=1)
canvas.get_tk_widget().grid(row=0,column=1)
show_plot=True
then the NavigationToolbar does vanish when "Plot!" is pressed a second time (but then there is, as expected, no new NavigationToolbar to replace the old). So grid_forget() is working, just not as expected.
What am I doing wrong? Is there a better way to update the NavigationToolbar?
Any help greatly appreciated!
Lastalda
Edit:
I found that this will work if you destroy the NavigationToolbar instead of forgetting it. But you have to completely re-create the widget again afterwards, of course:
canvas = FigureCanvasTkAgg(fig, master=root)
toolbar_frame = Frame(root)
global show_plot
if show_plot: # if plot already present, remove plot and destroy NavigationToolbar
    canvas.get_tk_widget().grid_forget()
    toolbar_frame.destroy()
toolbar_frame = Frame(root)
toolbar = NavigationToolbar2TkAgg(canvas, toolbar_frame)
toolbar_frame.grid(row=21,column=4,columnspan=3)
canvas.get_tk_widget().grid(row=1,column=4,columnspan=3,rowspan=20)
show_plot = True
However, the updating approach shown by Hans below is much nicer, since you don't have to destroy and recreate anything. I just wanted to highlight that the issue with my approach (apart from the inelegance and performance) was probably that I didn't use destroy().
A slightly different approach might be to reuse the figure for subsequent plots by clearing & redrawing it. That way, you don't have to destroy & regenerate either the figure or the toolbar:
from Tkinter import Tk, Button
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.backends.backend_tkagg import FigureCanvasTkAgg, NavigationToolbar2TkAgg

# plotting function: clear current, plot & redraw
def plot(x, y):
    plt.clf()
    plt.plot(x,y)
    # just plt.draw() won't do it here, strangely
    plt.gcf().canvas.draw()

# just to see the plot change
plotShift = 0

def main():
    global plotShift
    x = np.arange(0.0,3.0,0.01)
    y = np.sin(2*np.pi*x + plotShift)
    plot(x, y)
    plotShift += 1

# GUI
root = Tk()
draw_button = Button(root, text="Plot!", command = main)
draw_button.grid(row=0, column=0)

# init figure
fig = plt.figure()
canvas = FigureCanvasTkAgg(fig, master=root)
toolbar = NavigationToolbar2TkAgg(canvas, root)
canvas.get_tk_widget().grid(row=0,column=1)
toolbar.grid(row=1,column=1)
root.mainloop()
When you press the "Plot!" button it calls main. main creates the navigation toolbar. So, each time you press the button you get a toolbar. I don't know much about matplotlib, but it's pretty obvious why you get multiple toolbars.
By the way, grid_forget doesn't destroy the widget, it simply removes it from view. Even if you don't see it, it's still in memory.
Typically in a GUI, you create all the widgets exactly once rather than recreating the same widgets over and over.
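The "create once, update many times" idea can be sketched without any GUI at all. In the example below, PlotPanel and fake_toolbar are hypothetical stand-ins (not Tkinter or matplotlib classes): the factory is called a single time in the constructor, and every later replot only swaps the data.

```python
class PlotPanel:
    """Owns one toolbar for its whole lifetime; replotting only swaps the data."""

    def __init__(self, make_toolbar):
        # make_toolbar is a hypothetical factory; in a real GUI it would
        # build the canvas and navigation toolbar widgets
        self.toolbar = make_toolbar()  # created exactly once
        self.data = None

    def replot(self, data):
        # reuse the existing toolbar; only the plotted data changes
        self.data = data
        return self.toolbar

created = []

def fake_toolbar():
    # record every construction so we can count them afterwards
    created.append("toolbar")
    return len(created)

panel = PlotPanel(fake_toolbar)
for shift in range(3):
    panel.replot([shift, shift + 1])
print(len(created))
```

In the Tkinter code this corresponds to building the canvas and toolbar next to the other widgets in the GUI setup, and having the button callback only clear and redraw the figure.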