Debugging interactive matplotlib figures in Jupyter notebooks - matplotlib

The example below should highlight the point that you click on, and change the title of the graph to show the label associated with that point.
If I run this as a plain Python script and click on a point, I get the error "line 15, in onpick
TypeError: only integer scalar arrays can be converted to a scalar index", which is expected: event.ind is a sequence of indices, and I need to change that line to ind = event.ind[0] for the code to be correct.
However, when I run this in a Jupyter notebook, the figure appears but the error is silently ignored, so it just looks as if the code does not work. Is there a way to get Jupyter to show me that an error has occurred?
import numpy as np
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
x = [0, 1, 2, 3, 4, 5]
labels = ['a', 'b', 'c', 'd', 'e', 'f']
ax.plot(x, 'bo', picker=5)
# this is the transparent marker for the selected data point
marker, = ax.plot([0], [0], 'yo', visible=False, alpha=0.8, ms=15)
def onpick(event):
    ind = event.ind
    ax.set_title('Data point {0} is labeled "{1}"'.format(ind, labels[ind]))
    marker.set_visible(True)
    marker.set_xdata(x[ind])
    marker.set_ydata(x[ind])
    ax.figure.canvas.draw() # this line is critical to change the linewidth
fig.canvas.mpl_connect('pick_event', onpick)
plt.show()
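One possible way to make such a failure visible is sketched below. It assumes, as described above, that exceptions raised inside the callback never reach the notebook output, and it wraps the handler so that the last line of the traceback ends up in the figure title instead. The show_errors decorator is a hypothetical helper, not part of matplotlib, and the handler already uses the ind = event.ind[0] correction mentioned in the question.

```python
import traceback
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
x = [0, 1, 2, 3, 4, 5]
labels = ['a', 'b', 'c', 'd', 'e', 'f']
ax.plot(x, 'bo', picker=5)
# transparent marker for the selected data point
marker, = ax.plot([0], [0], 'yo', visible=False, alpha=0.8, ms=15)

def show_errors(callback):
    """Hypothetical helper: run callback and show any exception in the title."""
    def wrapper(event):
        try:
            callback(event)
        except Exception:
            # last traceback line, e.g. "IndexError: list index out of range"
            ax.set_title(traceback.format_exc().splitlines()[-1])
            ax.figure.canvas.draw_idle()
    return wrapper

@show_errors
def onpick(event):
    ind = event.ind[0]  # event.ind holds every picked index; use the first
    ax.set_title('Data point {0} is labeled "{1}"'.format(ind, labels[ind]))
    marker.set_visible(True)
    marker.set_xdata([ind])     # ax.plot(x) draws x against its index
    marker.set_ydata([x[ind]])
    ax.figure.canvas.draw_idle()

fig.canvas.mpl_connect('pick_event', onpick)
```

In a notebook the figure would still be displayed as before; only the handler changes. Writing the message to the title is one choice among several, since it does not depend on where the notebook routes printed output.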

Related

How can I use matplotlib.pyplot to customize geopandas plots?

What is the difference between geopandas plots and matplotlib plots? Why are not all keywords available?
In geopandas there is markersize, but not markeredgecolor...
In the example below I plot a pandas df with some styling, then transform the pandas df to a geopandas df. Simple plotting is working, but no additional styling.
This is just an example. In my geopandas plots I would like to customize, markers, legends, etc. How can I access the relevant matplotlib objects?
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import geopandas as gpd
X = np.linspace(-6, 6, 1024)
Y = np.sinc(X)
df = pd.DataFrame(Y, X)
plt.plot(X,Y,linewidth = 3., color = 'k', markersize = 9, markeredgewidth = 1.5, markerfacecolor = '.75', markeredgecolor = 'k', marker = 'o', markevery = 32)
# alternatively:
# df.plot(linewidth = 3., color = 'k', markersize = 9, markeredgewidth = 1.5, markerfacecolor = '.75', markeredgecolor = 'k', marker = 'o', markevery = 32)
plt.show()
# create GeoDataFrame from df
df.reset_index(inplace=True)
df.rename(columns={'index': 'Y', 0: 'X'}, inplace=True)
gdf = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df['Y'], df['X']))
gdf.plot(linewidth = 3., color = 'k', markersize = 9) # working
gdf.plot(linewidth = 3., color = 'k', markersize = 9, markeredgecolor = 'k') # not working
plt.show()
You're probably confused by the fact that both libraries name the method .plot(). In matplotlib that specifically produces a mpl.lines.Line2D object, which also holds the markers and their styling.
Geopandas assumes you want to plot geographic data and uses a Path for this (mpl.collections.PathCollection). That has, for example, face and edge colors, but no markers. The facecolor comes into play whenever your path closes and forms a polygon (yours doesn't, making it "just" a line).
Geopandas seems to use a bit of a trick for points/markers: it appears to draw a "path" using the "CURVE4" code (cubic Bézier).
You can explore what's happening if you capture the axes that geopandas returns:
ax = gdf.plot(...
Using ax.get_children() you'll get all artists that have been added to the axes. Since this is a simple plot, it's easy to see that the PathCollection is the actual data; the other artists draw the axis, spines, etc.
[<matplotlib.collections.PathCollection at 0x1c05d5879d0>,
<matplotlib.spines.Spine at 0x1c05d43c5b0>,
<matplotlib.spines.Spine at 0x1c05d43c4f0>,
<matplotlib.spines.Spine at 0x1c05d43c9d0>,
<matplotlib.spines.Spine at 0x1c05d43f1c0>,
<matplotlib.axis.XAxis at 0x1c05d036590>,
<matplotlib.axis.YAxis at 0x1c05d43ea10>,
Text(0.5, 1.0, ''),
Text(0.0, 1.0, ''),
Text(1.0, 1.0, ''),
<matplotlib.patches.Rectangle at 0x1c05d351b10>]
If you reduce the number of points a lot, say to 5 instead of 1024, retrieving the Path that was drawn shows the coordinates and also the codes used:
pcoll = ax.get_children()[0] # the first artist is the PathCollection
path = pcoll.get_paths()[0] # it only contains 1 Path
print(path.codes) # show the codes used.
# array([ 1, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
# 4, 4, 4, 4, 4, 4, 4, 4, 79], dtype=uint8)
Some more info about how these paths work can be found at:
https://matplotlib.org/stable/tutorials/advanced/path_tutorial.html
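As a small illustration of those codes, the sketch below uses only matplotlib (geopandas is not involved) and counts the codes of matplotlib's built-in unit circle path. The result has the same shape as the array printed above (one MOVETO, 24 CURVE4, one CLOSEPOLY), which suggests, without proving it, that the point marker is a circle built from cubic Bézier segments.

```python
from collections import Counter
from matplotlib.path import Path

# Map the numeric path codes to readable names.
code_names = {
    int(Path.MOVETO): "MOVETO",
    int(Path.LINETO): "LINETO",
    int(Path.CURVE3): "CURVE3",
    int(Path.CURVE4): "CURVE4",
    int(Path.CLOSEPOLY): "CLOSEPOLY",
}

circle = Path.unit_circle()  # matplotlib's Bezier approximation of a circle
counts = Counter(code_names[int(code)] for code in circle.codes)
print(counts)
```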
So, long story short: you do have all the same keywords as when using Matplotlib, but they're the keywords for Paths and not for the Line2D object that you might expect.
You can always flip the order around: start with a Matplotlib figure/axes that you create yourself, and pass that axes to Geopandas when you want to plot something. That might be easier or more intuitive when you (also) want to plot other things in the same axes. It does perhaps require a bit more discipline to make sure the (spatial) coordinates etc. match.
I personally almost always do that, because it lets me do most of the plotting with the same Matplotlib APIs, which admittedly have a slightly steeper learning curve. Overall I find that easier than dealing with the slightly different interpretation of every package that uses Matplotlib under the hood (e.g. geopandas, seaborn, xarray). But that really depends on where you're coming from.
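The difference between the two artist types can be seen with plain matplotlib alone. The sketch below is only an illustration with made-up data: ax.plot returns a Line2D that understands marker keywords such as markeredgecolor, while ax.scatter returns a PathCollection that is styled through edge and face colors instead.

```python
import matplotlib.pyplot as plt
from matplotlib.collections import PathCollection
from matplotlib.lines import Line2D

fig, ax = plt.subplots()

# Line2D: markers are styled with the marker* keywords.
line, = ax.plot([0, 1, 2], [0, 1, 0], marker='o',
                markeredgecolor='k', markerfacecolor='0.75')

# PathCollection: no marker* keywords, but edge and face colors.
points = ax.scatter([0, 1, 2], [1, 0, 1], s=81, c='0.75',
                    edgecolors='k', linewidths=1.5)

print(type(line).__name__, type(points).__name__)

# The collection can also be restyled after it has been created.
points.set_edgecolor('r')
points.set_linewidth(2.0)
```

The same setters should be usable on a PathCollection taken from ax.get_children(), as described above.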
Thank you for your detailed answer. Based on this I came up with this simplified code from my real project.
I have a shapefile shp and some point data df which I want to plot. shp is plotted with geopandas, df with matplotlib.pyplot. There is no need to transfer the point data into a GeoDataFrame gdf as I did initially.
# read marker data (places with coordinates)
df = pd.read_csv("../obese_pct_by_place.csv")
gdf = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df['sweref99_lng'], df['sweref99_lat']))
# read shapefile
shp = gpd.read_file("../../SWEREF_Shapefiles/KommunSweref99TM/Kommun_Sweref99TM_region.shp")
fig, ax = plt.subplots(figsize=(10, 8))
ax.set_aspect('equal')
shp.plot(ax=ax)
# plot obesity markers
# geopandas, no edgecolor here
# gdf.plot(ax=ax, marker='o', c='r', markersize=gdf['obese'] * 25)
# matplotlib.pyplot with edgecolor
plt.scatter(df['sweref99_lng'], df['sweref99_lat'], c='r', edgecolor='k', s=df['obese'] * 25)
plt.show()

mouse-over only on actual data points

Here's a really simple line chart.
%matplotlib notebook
import matplotlib.pyplot as plt
lines = plt.plot([1, 2, 3, 4], [1, 4, 9, 16])
plt.setp(lines,marker='D')
plt.ylabel('foo')
plt.xlabel('bar')
plt.show()
If I move my mouse over the chart, I get the x and y values for wherever the pointer is. Is there any way to get values only when I'm actually over a data point?
I understood you wanted to modify the behavior of the coordinates displayed in the status bar at the bottom right of the plot, is that right?
If so, you can "hijack" the Axes.format_coord() function to make it display whatever you want. You can see an example of this on matplotlib's example gallery.
In your case, something like this seems to do the trick:
import numpy as np
import matplotlib.pyplot as plt

my_x = np.array([1, 2, 3, 4])
my_y = np.array([1, 4, 9, 16])
eps = 0.1

def format_coord(x, y):
    # a point matches only if both its x and its y are within eps of the cursor
    close = np.isclose(my_x, x, atol=eps) & np.isclose(my_y, y, atol=eps)
    if np.any(close):
        i = np.argmax(close)  # index of the first matching point
        return 'x=%s y=%s' % (ax.format_xdata(my_x[i]), ax.format_ydata(my_y[i]))
    else:
        return ''

fig, ax = plt.subplots()
ax.plot(my_x, my_y, 'D-')
ax.set_ylabel('foo')
ax.set_xlabel('bar')
ax.format_coord = format_coord
plt.show()
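The same idea can also be packaged so that it does not rely on module-level names. The following is a minimal sketch, assuming a fixed tolerance in data units; make_format_coord is a hypothetical helper, not a matplotlib function. Because it needs no figure, it can be checked by calling it directly.

```python
import numpy as np

def make_format_coord(xs, ys, eps=0.1):
    """Hypothetical helper: build a format_coord that only reports data points."""
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)

    def format_coord(x, y):
        # a data point matches when the cursor is within eps in both directions
        close = (np.abs(xs - x) <= eps) & (np.abs(ys - y) <= eps)
        if not close.any():
            return ''
        i = int(np.argmax(close))  # first matching point
        return 'x={:g} y={:g}'.format(xs[i], ys[i])

    return format_coord

fmt = make_format_coord([1, 2, 3, 4], [1, 4, 9, 16])
print(repr(fmt(2.03, 3.95)))  # cursor close to the point (2, 4)
print(repr(fmt(2.5, 6.0)))    # cursor between data points
```

To use it on a plot, assign the result to the axes, for example ax.format_coord = make_format_coord(my_x, my_y).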

Not able to add 'map.drawcoastline' to 3d figure using 'ax.add_collection3d(map.drawcoastlines())'

I want to plot a 3D map using matplotlib Basemap, but an error message pops up.
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from mpl_toolkits.basemap import Basemap
from matplotlib.collections import PolyCollection
import numpy as np
map = Basemap(llcrnrlon=-20,llcrnrlat=0,urcrnrlon=15,urcrnrlat=50,)
fig = plt.figure()
ax = Axes3D(fig)
#ax.set_axis_off()
ax.azim = 270
ax.dist = 7
polys = []
for polygon in map.landpolygons:
    polys.append(polygon.get_coords())
lc=PolyCollection(polys,edgecolor='black',facecolor='#DDDDDD',closed=False)
ax.add_collection3d(lc)
ax.add_collection3d(map.drawcoastlines(linewidth=0.25))
ax.add_collection3d(map.drawcountries(linewidth=0.35))
lons = np.array([-13.7, -10.8, -13.2, -96.8, -7.99, 7.5, -17.3, -3.7])
lats = np.array([9.6, 6.3, 8.5, 32.7, 12.5, 8.9, 14.7, 40.39])
cases = np.array([1971, 7069, 6073, 4, 6, 20, 1, 1])
deaths = np.array([1192, 2964, 1250, 1, 5, 8, 0, 0])
places = np.array(['Guinea', 'Liberia', 'Sierra Leone','United States','Mali','Nigeria', 'Senegal', 'Spain'])
x, y = map(lons, lats)
ax.bar3d(x, y, np.zeros(len(x)), 2, 2, deaths, color= 'r', alpha=0.8)
plt.show()
I got an error message on line 21, i.e. ax.add_collection3d(map.drawcoastlines(linewidth=0.25)), saying:
'It is not currently possible to manually set the aspect '
NotImplementedError: It is not currently possible to manually set the aspect on 3D axes
I found this question because I had exactly the same problem.
I later chanced upon some documentation that revealed the workaround: if setting the aspect is not implemented, then let's not set it, by passing fix_aspect=False:
map = Basemap(fix_aspect=False)
EDIT:
I suppose I should add a little more to my previous answer to make it a little easier to understand what to do.
The NotImplementedError is a deliberate addition by the matplotlib team, as can be seen here. What the error is saying is that we are trying to fix the aspect ratio of the plot, but this is not implemented for 3D plots.
The error occurs when using Basemap (from mpl_toolkits.basemap) with 3D plots because it sets fix_aspect=True by default.
Therefore, to get rid of the NotImplementedError, add fix_aspect=False when constructing the Basemap. For example:
map = Basemap(llcrnrlon=-20,llcrnrlat=0,urcrnrlon=15,urcrnrlat=50,fix_aspect=False)

How to plot annotated heatmap on plotly?

I'm trying to make an annotated heatmap on plotly.
import plotly.plotly as py
import plotly.tools as tls
from plotly.graph_objs import *
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('masterc.csv')
locations = {}
anno = []
for i in range(df.shape[0]):
    locations.setdefault((df.iat[i,2],df.iat[i,6]),0)
    locations[(df.iat[i,2],df.iat[i,6])]+=df.iat[i,8]
x1 = []
y1 = []
z1 = []
z1_text = []
for key in locations.keys():
    if key[0] not in x1:
        x1 += [key[0],]
    if key[1] not in y1:
        y1 += [key[1],]
for y in y1:
    dummy = []
    for x in x1:
        if (x,y) in locations.keys():
            dummy += [locations[(x,y)],]
        else:
            dummy += [0,]
    z1 += [dummy,]
data = z1
arr = np.array(data)
fig, ax = plt.subplots()
ax.imshow(data, cmap='seismic')
for (i, j), z in np.ndenumerate(data):
    ax.text(j, i, '{:f}'.format(z), ha='center', va='center')
ax.set_xticklabels(x1, rotation=90)
ax.set_yticklabels(y1)
#plt.show()
py.plot_mpl(fig)
I'm getting the following warning
Warning (from warnings module):
File "C:\Python27\lib\site-packages\plotly\matplotlylib\renderer.py", line 394
warnings.warn("Aw. Snap! You're gonna have to hold off on "
UserWarning: Aw. Snap! You're gonna have to hold off on the selfies for now. Plotly can't import images from matplotlib yet!
and finally the following error
Traceback (most recent call last):
File "E:\Project Kumbh\heatmap with annotations.py", line 58, in <module>
py.plot_mpl(fig)
File "C:\Python27\lib\site-packages\plotly\plotly\plotly.py", line 261, in plot_mpl
return plot(fig, **plot_options)
File "C:\Python27\lib\site-packages\plotly\plotly\plotly.py", line 155, in plot
figure = tools.return_figure_from_figure_or_data(figure_or_data, validate)
File "C:\Python27\lib\site-packages\plotly\tools.py", line 1409, in return_figure_from_figure_or_data
if not figure['data']:
KeyError: 'data'
Is there any way to get around this error? Or is there a simple way to make an annotated heatmap on plotly?
Edit
It's now possible to do it easily with plotly.figure_factory:
https://plot.ly/python/annotated_heatmap/
As far as I know, it is still not possible to convert Matplotlib's heatmaps into Plotly's though.
Aug 2015 Answer
Here's an example of making an annotated heatmap with the python api:
import plotly.plotly as py
import plotly.graph_objs as go
x = ['A', 'B', 'C', 'D', 'E']
y = ['W', 'X', 'Y', 'Z']
# x0 x1 x2 x3 x4
z = [[0.00, 0.00, 0.75, 0.75, 0.00], # y0
[0.00, 0.00, 0.75, 0.75, 0.00], # y1
[0.75, 0.75, 0.75, 0.75, 0.75], # y2
[0.00, 0.00, 0.00, 0.75, 0.00]] # y3
annotations = go.Annotations()
for n, row in enumerate(z):
    for m, val in enumerate(row):
        annotations.append(go.Annotation(text=str(z[n][m]), x=x[m], y=y[n],
                                         xref='x1', yref='y1', showarrow=False))
colorscale = [[0, '#3D9970'], [1, '#001f3f']] # custom colorscale
trace = go.Heatmap(x=x, y=y, z=z, colorscale=colorscale, showscale=False)
fig = go.Figure(data=go.Data([trace]))
fig['layout'].update(
    title="Annotated Heatmap",
    annotations=annotations,
    xaxis=go.XAxis(ticks='', side='top'),
    yaxis=go.YAxis(ticks='', ticksuffix=' '), # ticksuffix is a workaround to add a bit of padding
    width=700,
    height=700,
    autosize=False
)
print py.plot(fig, filename='Stack Overflow 31756636', auto_open=False) # https://plot.ly/~theengineear/5179
With the result at https://plot.ly/~theengineear/5179
Linking a related GitHub issue: https://github.com/plotly/python-api/issues/273
You have to use Plotly's declarative syntax instead of converting from matplotlib to Plotly. Plotly only supports the matplotlib figure objects that it can reverse engineer, and unfortunately heatmaps aren't one of them. Here are the Plotly Python heatmap docs:
https://plot.ly/python/heatmaps/
And here are the Plotly Python annotation docs:
https://plot.ly/python/text-and-annotations/
Make sure to set the annotations to be referenced to the data rather than the page.
You could also overlay a scatter plot with a hover text field on the heatmap, but set the mode of the scatter plot to text. This would make only the text show and not the scatter plot points. Docs:
https://plot.ly/python/text-and-annotations/

Matplotlib: combining two bar charts

I'm trying to generate 'violin'-like bar charts, but I'm running into several difficulties, described below.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
# init data
label = ['aa', 'b', 'cc', 'd']
data1 = [5, 7, 6, 9]
data2 = [7, 3, 6, 1]
data1_minus = np.array(data1)*-1
gs = gridspec.GridSpec(1, 2, top=0.95, bottom=0.07,)
fig = plt.figure(figsize=(7.5, 4.0))
# adding left bar chart
ax1 = fig.add_subplot(gs[0])
ax1.barh(pos, data1_minus)
ax1.yaxis.tick_right()
ax1.yaxis.set_label(label)
# adding right bar chart
ax2 = fig.add_subplot(gs[1], sharey=ax1)
ax2.barh(pos, data2)
1. Trouble adding 'label' as labels for both charts to share.
2. Centering the labels between the two plots (as well as vertically in the center of each bar).
3. Keeping just the ticks on the outer y-axis (not the inner one, where the labels would go).
If I understand the question correctly, I believe these changes accomplish what you're looking for:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
# init data
label = ['aa', 'b', 'cc', 'd']
data1 = [5, 7, 6, 9]
data2 = [7, 3, 6, 1]
data1_minus = np.array(data1)*-1
gs = gridspec.GridSpec(1, 2, top=0.95, bottom=0.07,)
fig = plt.figure(figsize=(7.5, 4.0))
pos = np.arange(4)
# adding left bar chart
ax1 = fig.add_subplot(gs[0])
ax1.barh(pos, data1_minus, align='center')
# set tick positions and labels appropriately
ax1.yaxis.tick_right()
ax1.set_yticks(pos)
ax1.set_yticklabels(label)
ax1.tick_params(axis='y', pad=15)
# adding right bar chart
ax2 = fig.add_subplot(gs[1], sharey=ax1)
ax2.barh(pos, data2, align='center')
# turn off the second axis tick labels without disturbing the originals
[lbl.set_visible(False) for lbl in ax2.get_yticklabels()]
plt.show()
This yields this plot:
As for keeping the actual numerical ticks (if you want those), the normal matplotlib interface ties the ticks pretty closely together when the axes are shared (or twinned). However, the axes_grid1 toolkit can allow you more control, so if you want some numerical ticks you can replace the entire ax2 section above with the following:
from mpl_toolkits.axes_grid1 import host_subplot
ax2 = host_subplot(gs[1], sharey=ax1)
ax2.barh(pos, data2, align='center')
par = ax2.twin()
par.set_xticklabels('')
par.set_yticks(pos)
par.set_yticklabels([str(x) for x in pos])
[lbl.set_visible(False) for lbl in ax2.get_yticklabels()]
which yields: