NetworkX, Matplotlib GUI Application - matplotlib

I am trying to write an application that repeatedly creates different graphs according to parameters that the user enters through a simple GUI.
All graphs, together with some text describing them, need to be displayed in one and the same Jupyter cell / window / chart, redrawn with every new user input.
The code below is a simple Jupyter illustration of what I want to do:
import random
import networkx as nx
import matplotlib.pyplot as plt
while input():
    i = random.randint(5, 10)
    G = nx.complete_graph(i)
    nx.draw_networkx(G, with_labels=True)
    plt.show()
    print("Nodes data: ", G.nodes.data())
    print("Connected: ", list(nx.connected_components(G)))
In Jupyter, running this loop results in a sequence of text and graph images in the output:
Nodes data: [(0, {}), (1, {}), (2, {}), (3, {}), (4, {}), (5, {}), (6, {}), (7, {}), (8, {}), (9, {})]
Connected: [{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}]
Nodes data: [(0, {}), (1, {}), (2, {}), (3, {}), (4, {})]
Connected: [{0, 1, 2, 3, 4}]
And so on ...
Is it possible to redraw NetworkX + Matplotlib graphs and the accompanying text in a single Jupyter cell, or in a window of some other Python GUI framework?
Jupyter is not an absolute requirement for me; any other Python GUI will do as well.
Thanks!

This is definitely possible (assuming I understood what you want to do correctly).
I would recommend using ipywidgets to handle the GUI part.
See example below:
import networkx as nx
import matplotlib.pyplot as plt
import ipywidgets as widgets
N_nodes_widget=widgets.IntSlider(min=1,max=100,value=5,step=1,continuous_update=False) #simple integer slider
def graph_generation(N_nodes):
    G = nx.complete_graph(N_nodes)
    nx.draw_networkx(G, with_labels=True)
    print("Nodes data: ", G.nodes.data())
    print("Connected: ", list(nx.connected_components(G)))
    plt.show()
widgets.interact(graph_generation,N_nodes=N_nodes_widget); #interact function links slider to the graph generation function
Edit:
In case you want to keep the previous iterations of your graphs, you can store the graphs in a list and create a new figure with a (1, N_graphs) grid of subplots every time the user interacts with the widget. The information about each graph can be displayed in the title of its subplot. See example below:
import networkx as nx
import matplotlib.pyplot as plt
import ipywidgets as widgets
N_nodes_widget=widgets.IntSlider(min=1,max=30,value=5,step=1,continuous_update=False)
list_plots=[] #create empty list of graphs
def graph_generation(N_nodes):
    G = nx.complete_graph(N_nodes)
    list_plots.append(G)
    fig, axs = plt.subplots(nrows=1, ncols=len(list_plots)) #create figure with N_graphs subplots
    for g, ax in zip(list_plots, fig.axes): #draw each stored graph in its own subplot
        nx.draw_networkx(g, with_labels=True, ax=ax)
        ax.set_title("Nodes data:\n "+str(g.nodes.data()) + "\nConnected:"+str(list(nx.connected_components(g))), fontsize=6) #use title to print graph info
    plt.tight_layout()
    plt.show()
widgets.interact(graph_generation,N_nodes=N_nodes_widget);
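Since the question also allows solutions outside Jupyter, here is a minimal sketch of the same "redraw in place" idea using a single Matplotlib figure that is cleared on every update. The helper name redraw is made up for this illustration, and the Agg backend is selected only so that the sketch runs without a display; in a real GUI you would drop that line and call the helper from your input callback.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; remove for an interactive window
import matplotlib.pyplot as plt
import networkx as nx


def redraw(fig, n_nodes):
    """Clear the figure and draw a complete graph with n_nodes nodes (hypothetical helper)."""
    fig.clf()                          # wipe whatever was drawn before
    ax = fig.add_subplot(111)
    G = nx.complete_graph(n_nodes)
    nx.draw_networkx(G, with_labels=True, ax=ax)
    # Put the accompanying text into the title so it is replaced together with the graph
    info = f"Nodes: {list(G.nodes)}\nConnected: {list(nx.connected_components(G))}"
    ax.set_title(info, fontsize=8)
    fig.canvas.draw_idle()             # ask the canvas to refresh
    return G


fig = plt.figure()
for n in (5, 8, 3):                    # stands in for three successive user inputs
    G = redraw(fig, n)
```

After the loop the figure still holds exactly one axes, showing only the most recent graph and its description.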

Related

(matplotlib)Avoiding AnchoredText overlapping

My goal is to draw a scatter plot with AnchoredText. Graph (1) below is as expected.
But graph (2) shows overlapping AnchoredTexts. How can I keep the texts from overlapping?
My data and drawing function are:
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.offsetbox import AnchoredText
data=pd.DataFrame({'val':[1, 1, 2, 2.1, 2, 2.5, 2.3],'site':['a','a','a','b','b','b','b'],
                   'X2':[4, 5 ,6 ,10, 10, 11, 11], 'X3':[100,100,200,200,200,300,300],
                   'applydate':[1101,1102,1201,1202,1204,1204,1204],
                   'X1':['b','b','h','b','b','h','h'] })
def my_scatter(x,y, **kwargs):
    plt.scatter(x=x, y=y,**kwargs)
    mx = np.mean(x)
    my = np.mean(y)
    strText='AVG_H:'+str(np.round(mx,3))+'\nAVG_V:'+str(np.round(my,3))
    textbox = AnchoredText(strText, loc='lower right', frameon=False,)
    plt.gca().add_artist(textbox)
Graph (1), which is OK:
g = sns.FacetGrid(data,col='site',height=3)
g.map(my_scatter, "X2", "val",s=100, alpha=.5)
g.add_legend()
And graph (2), which shows overlapping texts. (The problem is that the statistics for the blue dots and the orange dots overlap.)
g = sns.FacetGrid(data,col='site',height=3, hue='X1') # I've assigned column X1 to hue variable.
g.map(my_scatter, "X2", "val",s=100, alpha=.5)
g.add_legend()
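One possible way to avoid the overlap, offered only as a sketch and not as an answer from the thread: count how many AnchoredText boxes an axes already holds and anchor each new box in a different corner. The helper add_stats_box and the CORNERS order are assumptions made for this illustration; the example uses plain Matplotlib so that it runs without seaborn.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.offsetbox import AnchoredText

# Hypothetical order of corners used for successive boxes on the same axes
CORNERS = ['lower right', 'upper right', 'upper left', 'lower left']


def add_stats_box(ax, x, y, color='k'):
    """Add an AVG box in the next unused corner of ax and return that corner."""
    # Count the AnchoredText boxes that were already added to this axes
    n_boxes = sum(isinstance(child, AnchoredText) for child in ax.get_children())
    loc = CORNERS[n_boxes % len(CORNERS)]
    text = f'AVG_H:{np.mean(x):.3f}\nAVG_V:{np.mean(y):.3f}'
    ax.add_artist(AnchoredText(text, loc=loc, frameon=False, prop=dict(color=color)))
    return loc


fig, ax = plt.subplots()
first = add_stats_box(ax, [10, 10], [2.1, 2.0], color='C0')   # e.g. hue level 'b'
second = add_stats_box(ax, [11, 11], [2.5, 2.3], color='C1')  # e.g. hue level 'h'
```

Inside my_scatter this could replace the AnchoredText lines with something like add_stats_box(plt.gca(), x, y, color=kwargs.get('color', 'k')), so that each hue level gets its own corner and matching text color.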

How can I use matplotlib.pyplot to customize geopandas plots?

What is the difference between geopandas plots and matplotlib plots? Why are not all keywords available?
In geopandas there is markersize, but not markeredgecolor...
In the example below I plot a pandas df with some styling, then transform the pandas df to a geopandas df. Simple plotting is working, but no additional styling.
This is just an example. In my geopandas plots I would like to customize, markers, legends, etc. How can I access the relevant matplotlib objects?
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import geopandas as gpd
X = np.linspace(-6, 6, 1024)
Y = np.sinc(X)
df = pd.DataFrame(Y, X)
plt.plot(X,Y,linewidth = 3., color = 'k', markersize = 9, markeredgewidth = 1.5, markerfacecolor = '.75', markeredgecolor = 'k', marker = 'o', markevery = 32)
# alternatively:
# df.plot(linewidth = 3., color = 'k', markersize = 9, markeredgewidth = 1.5, markerfacecolor = '.75', markeredgecolor = 'k', marker = 'o', markevery = 32)
plt.show()
# create GeoDataFrame from df
df.reset_index(inplace=True)
df.rename(columns={'index': 'Y', 0: 'X'}, inplace=True)
gdf = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df['Y'], df['X']))
gdf.plot(linewidth = 3., color = 'k', markersize = 9) # working
gdf.plot(linewidth = 3., color = 'k', markersize = 9, markeredgecolor = 'k') # not working
plt.show()
You're probably confused by the fact that both libraries named the method .plot(). In Matplotlib that specifically translates to a mpl.lines.Line2D object, which also contains the markers and their styling.
Geopandas assumes you want to plot geographic data and uses a Path for this (mpl.collections.PathCollection). That has, for example, face and edge colors, but no markers. The facecolor comes into play whenever your path closes and forms a polygon (your example doesn't, making it "just" a line).
Geopandas seems to use a bit of a trick for points/markers: it appears to draw a "path" using the "CURVE4" code (cubic Bézier).
You can explore what's happening if you capture the axes that geopandas returns:
ax = gdf.plot(...
Using ax.get_children() you'll get all artists that have been added to the axes. Since this is a simple plot, it's easy to see that the PathCollection is the actual data; the other artists draw the axes, spines, etc.
[<matplotlib.collections.PathCollection at 0x1c05d5879d0>,
<matplotlib.spines.Spine at 0x1c05d43c5b0>,
<matplotlib.spines.Spine at 0x1c05d43c4f0>,
<matplotlib.spines.Spine at 0x1c05d43c9d0>,
<matplotlib.spines.Spine at 0x1c05d43f1c0>,
<matplotlib.axis.XAxis at 0x1c05d036590>,
<matplotlib.axis.YAxis at 0x1c05d43ea10>,
Text(0.5, 1.0, ''),
Text(0.0, 1.0, ''),
Text(1.0, 1.0, ''),
<matplotlib.patches.Rectangle at 0x1c05d351b10>]
If you reduce the number of points a lot, for example to 5 instead of 1024, retrieving the drawn Paths shows the coordinates and also the codes used:
pcoll = ax.get_children()[0] # the first artist is the PathCollection
path = pcoll.get_paths()[0] # it only contains 1 Path
print(path.codes) # show the codes used.
# array([ 1, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
# 4, 4, 4, 4, 4, 4, 4, 4, 79], dtype=uint8)
Some more info about how these paths work can be found at:
https://matplotlib.org/stable/tutorials/advanced/path_tutorial.html
So, long story short, you do have all the same keywords as when using Matplotlib, but they're the keywords for Paths and not for the Line2D object that you might expect.
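To make the distinction concrete, here is a small Matplotlib-only sketch that styles the same points once through Line2D keywords and once through collection keywords. It assumes, in line with the explanation above, that extra keywords given to gdf.plot() for point geometries end up on a PathCollection, so edgecolor / facecolor / linewidth would be the names to try there instead of markeredgecolor and friends.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import PathCollection
from matplotlib.lines import Line2D

x = np.linspace(-6, 6, 5)
y = np.sinc(x)

fig, (ax1, ax2) = plt.subplots(ncols=2)

# Line2D: markers are styled with the marker* keywords
(line,) = ax1.plot(x, y, color='k', marker='o', markersize=9,
                   markerfacecolor='.75', markeredgecolor='k', markeredgewidth=1.5)

# PathCollection: the comparable styling uses facecolor / edgecolor / linewidth
points = ax2.scatter(x, y, s=81, facecolor='.75', edgecolor='k', linewidth=1.5)

print(type(line).__name__, type(points).__name__)
```

Note that scatter's s is an area in points squared, so s=81 roughly corresponds to markersize=9.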
You can always flip the order around, and start with a Matplotlib figure/axes created by you, and pass that axes to Geopandas when you want to plot something. That might make it easier or more intuitive when you (also) want to plot other things in the same axes. It does require perhaps a bit more discipline to make sure the (spatial)coordinates etc match.
I personally almost always do that, because it allows me to do most of the plotting using the same Matplotlib APIs, which admittedly have a slightly steeper learning curve. Overall I find that easier than having to deal with the slightly different interpretation of every package that uses Matplotlib under the hood (e.g. geopandas, seaborn, xarray). But that really depends on where you're coming from.
Thank you for your detailed answer. Based on this I came up with this simplified code from my real project.
I have a shapefile shp and some point data df which I want to plot. shp is plotted with geopandas, df with matplotlib.pyplot. There is no need to transfer the point data into a GeoDataFrame gdf as I did initially.
# read marker data (places with coordinates)
df = pd.read_csv("../obese_pct_by_place.csv")
gdf = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df['sweref99_lng'], df['sweref99_lat']))
# read shapefile
shp = gpd.read_file("../../SWEREF_Shapefiles/KommunSweref99TM/Kommun_Sweref99TM_region.shp")
fig, ax = plt.subplots(figsize=(10, 8))
ax.set_aspect('equal')
shp.plot(ax=ax)
# plot obesity markers
# geopandas, no edgecolor here
# gdf.plot(ax=ax, marker='o', c='r', markersize=gdf['obese'] * 25)
# matplotlib.pyplot with edgecolor
plt.scatter(df['sweref99_lng'], df['sweref99_lat'], c='r', edgecolor='k', s=df['obese'] * 25)
plt.show()

Distinct color dots on Multidimensional Scaling plot (MDS) with plt.annotate()

Currently my MDS plot looks like this. Instead of having the numberings on the MDS plot, I would like to replace them with dots. My desired output is that the point annotated as 'highest' will be a blue dot, the point annotated as 'lowest' will be a red dot, and all the other points will be grey dots.
Code to reproduce the MDS plot above.
import numpy as np
import scipy
import matplotlib.pyplot as plt
from sklearn.metrics import pairwise_distances #jaccard diss.
from sklearn import manifold # multidimensional scaling
foods_binary = np.random.randint(2, size=(100, 10)) #initial dataset
print(foods_binary.shape)
dis_matrix = pairwise_distances(foods_binary, metric = 'jaccard')
mds_model = manifold.MDS(n_components = 2, random_state = 123,
                         dissimilarity = 'precomputed')
mds_fit = mds_model.fit(dis_matrix)
mds_coords = mds_model.fit_transform(dis_matrix)
plt.figure()
plt.scatter(mds_coords[:,0],mds_coords[:,1],
            facecolors = 'none', edgecolors = 'none') # points in white (invisible)
labels = [ 1, 2, 3, 4, 5, 'highest', 5, 6, 7, 8, 9, 'lowest']
for label, x, y in zip(labels, mds_coords[:,0], mds_coords[:,1]):
    plt.annotate(label, (x,y), xycoords = 'data')
plt.xlabel('First Dimension')
plt.ylabel('Second Dimension')
plt.title('Dissimilarity among food items')
plt.show()
How can I achieve what I desire above? Thanks for any suggestions.
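A possible approach, sketched here under assumptions rather than taken from an answer in the thread: build one color per label and pass the list to scatter via c. The random coords array below is only a stand-in for mds_coords so that the example runs without scikit-learn, and the special mapping is a name invented for this illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(123)
coords = rng.normal(size=(12, 2))  # stand-in for mds_coords (assumption)
labels = [1, 2, 3, 4, 5, 'highest', 5, 6, 7, 8, 9, 'lowest']

# Map the two special labels to their colors; everything else becomes grey
special = {'highest': 'blue', 'lowest': 'red'}
colors = [special.get(label, 'grey') for label in labels]

fig, ax = plt.subplots()
ax.scatter(coords[:, 0], coords[:, 1], c=colors)

# Optionally keep a text annotation for the two special points only
for label, (x, y) in zip(labels, coords):
    if label in special:
        ax.annotate(label, (x, y), xycoords='data')

ax.set_xlabel('First Dimension')
ax.set_ylabel('Second Dimension')
ax.set_title('Dissimilarity among food items')
```

With the real data, coords would be mds_coords restricted to the labelled points (for example mds_coords[:len(labels)]), since scatter needs exactly one color per point.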

How to start Seaborn Logarithmic Barplot at y=1

I have a problem figuring out how to have Seaborn show the right values in a logarithmic barplot. In the ideal case, a value of mine should be 1. My data series (5, 2, 1, 0.5, 0.2) has a set of values that deviate from unity, and I want to visualize these in a logarithmic barplot. However, when plotting this in the standard log-barplot it shows the following:
But the values under one are shown to increase from -infinity to their value, whilst the real values ought to look like this:
Strangely enough, I was unable to find a Seaborn, Pandas or Matplotlib attribute to "snap" to a different horizontal axis or "align" or ymin/ymax. I have a feeling I am unable to find it because I can't find the terms to shove down my favorite search engine. Some semi-solutions I found just did not match what I was looking for or did not have either xaxis = 1 or a ylog. A try that uses some jank Matplotlib lines:
If someone knows the right terms or a solution, thank you in advance.
Here are the Jupyter cells I used:
{1}
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
data = {'X': ['A','B','C','D','E'], 'Y': [5,2,1,0.5,0.2]}
df = pd.DataFrame(data)
{2}
%matplotlib widget
g = sns.catplot(data=df, kind="bar", y = "Y", x = "X", log = True)
{3}
%matplotlib widget
plt.vlines(x=data['X'], ymin=1, ymax=data['Y'])
You could let the bars start at 1 instead of at 0. You'll need to use sns.barplot directly.
The example code subtracts 1 from all y-values and sets the bar bottom at 1, so each bar runs from 1 to its original value.
import matplotlib.pyplot as plt
from matplotlib.ticker import NullFormatter
import seaborn as sns
import pandas as pd
import numpy as np
data = {'X': ['A', 'B', 'C', 'D', 'E'], 'Y': [5, 2, 1, 0.5, 0.2]}
df = pd.DataFrame(data)
ax = sns.barplot(y=df["Y"] - 1, x=df["X"], bottom=1, log=True, palette='flare_r')
ax.axhline(y=1, c='k')
# change the y-ticks, as the default shows too few in this case
ax.set_yticks(np.append(np.arange(.2, .8, .1), np.arange(1, 7, 1)), minor=False)
ax.set_yticks(np.arange(.3, 6, .1), minor=True)
ax.yaxis.set_major_formatter(lambda x, pos: f'{x:.0f}' if x >= 1 else f'{x:.1f}')
ax.yaxis.set_minor_formatter(NullFormatter())
ax.bar_label(ax.containers[0], labels=df["Y"])
sns.despine()
plt.show()
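The same "height minus 1, bottom at 1" idea can also be sketched with plain Matplotlib, which may help to see what the seaborn call does underneath. This is a minimal illustration under that assumption, without the tick formatting from the code above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

names = ['A', 'B', 'C', 'D', 'E']
values = np.array([5, 2, 1, 0.5, 0.2])

fig, ax = plt.subplots()
# Each bar starts at y=1 and has height (value - 1), so it ends at the value itself;
# values below 1 get a negative height and therefore point downwards
bars = ax.bar(names, values - 1, bottom=1)
ax.set_yscale('log')
ax.axhline(y=1, c='k')  # reference line at unity

# Where each bar ends, recovered from the rectangles
tops = [bar.get_y() + bar.get_height() for bar in bars]
```

Here tops reproduces the original data series, which confirms that every bar ends exactly at its value.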
PS: With these specific values, the plot might also work without a log scale.

Not able to add 'map.drawcoastline' to 3d figure using 'ax.add_collection3d(map.drawcoastlines())'

So I want to plot a 3d map using matplotlib basemap. But an error message comes popping up.
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from mpl_toolkits.basemap import Basemap
from matplotlib.collections import PolyCollection
import numpy as np
map = Basemap(llcrnrlon=-20,llcrnrlat=0,urcrnrlon=15,urcrnrlat=50,)
fig = plt.figure()
ax = Axes3D(fig)
#ax.set_axis_off()
ax.azim = 270
ax.dist = 7
polys = []
for polygon in map.landpolygons:
    polys.append(polygon.get_coords())
lc=PolyCollection(polys,edgecolor='black',facecolor='#DDDDDD',closed=False)
ax.add_collection3d(lc)
ax.add_collection3d(map.drawcoastlines(linewidth=0.25))
ax.add_collection3d(map.drawcountries(linewidth=0.35))
lons = np.array([-13.7, -10.8, -13.2, -96.8, -7.99, 7.5, -17.3, -3.7])
lats = np.array([9.6, 6.3, 8.5, 32.7, 12.5, 8.9, 14.7, 40.39])
cases = np.array([1971, 7069, 6073, 4, 6, 20, 1, 1])
deaths = np.array([1192, 2964, 1250, 1, 5, 8, 0, 0])
places = np.array(['Guinea', 'Liberia', 'Sierra Leone','United States','Mali','Nigeria', 'Senegal', 'Spain'])
x, y = map(lons, lats)
ax.bar3d(x, y, np.zeros(len(x)), 2, 2, deaths, color= 'r', alpha=0.8)
plt.show()
I got an error message on line 21, i.e. ax.add_collection3d(map.drawcoastlines(linewidth=0.25)), saying:
'It is not currently possible to manually set the aspect '
NotImplementedError: It is not currently possible to manually set the aspect on 3D axes
I found this question because I had the exact same question.
I later chanced upon some documentation that revealed the workaround - if setting of aspect is not implemented, then let's not set it by setting fix_aspect to false:
map = Basemap(fix_aspect=False)
EDIT:
I suppose I should add a little more to my previous answer to make it a little easier to understand what to do.
The NotImplementedError is a deliberate addition by the matplotlib team, as can be seen here. What the error is saying is that we are trying to fix the aspect ratio of the plot, but this is not implemented in 3d plots.
This error occurs when using mpl_toolkits.basemap.Basemap with 3d plots, as it sets fix_aspect=True by default.
Therefore, to do away with the NotImplementedError, one can consider adding fix_aspect=False when calling Basemap(). For example:
map = Basemap(llcrnrlon=-20,llcrnrlat=0,urcrnrlon=15,urcrnrlat=50,fix_aspect=False)