How to add markers on legend and graph - matplotlib

I have the following code:
from matplotlib import pyplot as plt
import seaborn as sns
fig = plt.figure()
fig.suptitle('Average GPA and Standard Deviation per course combination', fontsize=15)
plt.xlabel('Standard Deviation of Average GPA', fontsize=12)
plt.ylabel('Average GPA', fontsize=12)
colors = ['#E74C3C', '#76448A', '#3498DB', '#17A589', '#F1C40F', '#F39C12', '#CA6F1E', '#B3B6B7', '#34495E',
'#F5B7B1']
marker = ['.','v','^','1','2','8','p','P','x','X']
g = sns.scatterplot(x=all_stdev, y=all_gpas, hue=final_courses)
g.legend(loc='center left', bbox_to_anchor=(1, 0.5), ncol=1)
plt.show()
all_stdev, all_gpas, and final_courses are lists that change every time, since they hold recommendations computed from the user's input. For one particular student I get the following result:
I tried adding markers to make the results easier for the user to read, but nothing I tried worked: the markers appeared in the graph all in the same color, while the legend still showed only the colors, as above. I need the markers to appear in both the graph and the legend. Is there a way to do this? I already have a list of markers that I would like to use in the code provided.

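You need to pass both the markers and the colors as parameters to scatterplot: style= selects the variable that controls the marker, markers= supplies the marker list, and palette= supplies the colors.
There is one more problem with the markers. Seaborn complains: Filled and line art markers cannot be mixed. So you need to pick either all filled or all line art markers.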
from matplotlib import pyplot as plt
import seaborn as sns
import numpy as np
fig = plt.figure()
fig.suptitle('Average GPA and Standard Deviation per course combination', fontsize=15)
plt.xlabel('Standard Deviation of Average GPA', fontsize=12)
plt.ylabel('Average GPA', fontsize=12)
colors = ['#E74C3C', '#76448A', '#3498DB', '#17A589', '#F1C40F', '#F39C12', '#CA6F1E', '#B3B6B7', '#34495E', '#F5B7B1']
# marker = ['.', 'v', '^', '1', '2', '8', 'p', 'P', 'x', 'X']
marker = ['o', 'v', '^', '8', '*', 'P', 'D', 'X', 's', 'p']
N = 30
final_courses = np.random.randint(1,11, N) * 10
all_stdev = np.random.uniform(0, 2, N)
all_gpas = np.random.uniform(3, 4, N)
g = sns.scatterplot(x=all_stdev, y=all_gpas, hue=final_courses, style=final_courses, palette=colors, markers=marker)
g.legend(loc='center left', bbox_to_anchor=(1, 0.5), ncol=1)
plt.show()
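To see which entries of the original marker list trigger that complaint, each marker can be checked with Matplotlib's MarkerStyle.is_filled(). This is a minimal sketch and not part of the original answer; the names filled and line_art are made up for illustration.

```python
from matplotlib.markers import MarkerStyle

# The marker list from the question, which mixes both kinds.
marker = ['.', 'v', '^', '1', '2', '8', 'p', 'P', 'x', 'X']

# Split the list by whether Matplotlib treats the marker as filled.
filled = [m for m in marker if MarkerStyle(m).is_filled()]
line_art = [m for m in marker if not MarkerStyle(m).is_filled()]

print("filled:  ", filled)
print("line art:", line_art)
```

Swapping the markers reported as line art for filled ones (or the other way round) should give a list that seaborn accepts.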

Related

How can I use matplotlib.pyplot to customize geopandas plots?

What is the difference between geopandas plots and matplotlib plots? Why are not all keywords available?
In geopandas there is markersize, but not markeredgecolor...
In the example below I plot a pandas df with some styling, then convert the pandas df to a geopandas df. Simple plotting works, but additional styling does not.
This is just an example. In my geopandas plots I would like to customize markers, legends, etc. How can I access the relevant matplotlib objects?
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import geopandas as gpd
X = np.linspace(-6, 6, 1024)
Y = np.sinc(X)
df = pd.DataFrame(Y, X)
plt.plot(X,Y,linewidth = 3., color = 'k', markersize = 9, markeredgewidth = 1.5, markerfacecolor = '.75', markeredgecolor = 'k', marker = 'o', markevery = 32)
# alternatively:
# df.plot(linewidth = 3., color = 'k', markersize = 9, markeredgewidth = 1.5, markerfacecolor = '.75', markeredgecolor = 'k', marker = 'o', markevery = 32)
plt.show()
# create GeoDataFrame from df
df.reset_index(inplace=True)
df.rename(columns={'index': 'Y', 0: 'X'}, inplace=True)
gdf = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df['Y'], df['X']))
gdf.plot(linewidth = 3., color = 'k', markersize = 9) # working
gdf.plot(linewidth = 3., color = 'k', markersize = 9, markeredgecolor = 'k') # not working
plt.show()
You're probably confused by the fact that both libraries named the method .plot(). In matplotlib that specifically creates a mpl.lines.Line2D object, which also contains the markers and their styling.
Geopandas assumes you want to plot geographic data and uses a Path for this (mpl.collections.PathCollection). That has, for example, face and edge colors, but no markers. The facecolor comes into play whenever your path closes and forms a polygon (your example doesn't, making it "just" a line).
Geopandas seems to use a bit of a trick for points/markers: it appears to draw a "path" using the "CURVE4" code (cubic Bézier).
You can explore what's happening if you capture the axes that geopandas returns:
ax = gdf.plot(...
Using ax.get_children() you get all artists that have been added to the axes. Since this is a simple plot, it's easy to see that the PathCollection is the actual data; the other artists draw the axes, spines, etc.
[<matplotlib.collections.PathCollection at 0x1c05d5879d0>,
<matplotlib.spines.Spine at 0x1c05d43c5b0>,
<matplotlib.spines.Spine at 0x1c05d43c4f0>,
<matplotlib.spines.Spine at 0x1c05d43c9d0>,
<matplotlib.spines.Spine at 0x1c05d43f1c0>,
<matplotlib.axis.XAxis at 0x1c05d036590>,
<matplotlib.axis.YAxis at 0x1c05d43ea10>,
Text(0.5, 1.0, ''),
Text(0.0, 1.0, ''),
Text(1.0, 1.0, ''),
<matplotlib.patches.Rectangle at 0x1c05d351b10>]
If you reduce the number of points a lot, say to 5 instead of 1024, retrieving the drawn Path shows the coordinates and also the codes used:
pcoll = ax.get_children()[0] # the first artist is the PathCollection
path = pcoll.get_paths()[0] # it only contains 1 Path
print(path.codes) # show the codes used.
# array([ 1, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
# 4, 4, 4, 4, 4, 4, 4, 4, 79], dtype=uint8)
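The integers in that array can be mapped back to names using the constants on matplotlib.path.Path. The sketch below does this for Matplotlib's built-in unit circle, whose codes have the same pattern as the output above. That the single Path in the collection is the circular marker shape is an assumption on my part, not something the output proves; the code_names dict is a helper made up for this example.

```python
from matplotlib.path import Path

# Build a lookup from integer code to its name (hypothetical helper).
code_names = {
    int(getattr(Path, name)): name
    for name in ("STOP", "MOVETO", "LINETO", "CURVE3", "CURVE4", "CLOSEPOLY")
}

# The Bezier approximation of a circle that Matplotlib ships with.
circle = Path.unit_circle()
names = [code_names[int(code)] for code in circle.codes]

print(names[0], names[1], names[-1], len(names))
```

So 1 is MOVETO, 4 is CURVE4 and 79 is CLOSEPOLY.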
Some more info about how these paths work can be found at:
https://matplotlib.org/stable/tutorials/advanced/path_tutorial.html
So, long story short, you do have Matplotlib keywords available, but they're the keywords for a PathCollection and not for the Line2D object that you might expect.
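The difference between the two keyword families can be seen in plain Matplotlib, without geopandas. In this minimal sketch, ax.plot returns a Line2D styled with marker* keywords, while ax.scatter returns a PathCollection styled with facecolor/edgecolor. Assuming geopandas forwards extra keywords to the underlying collection, as described above, edgecolor='k' would be the counterpart of markeredgecolor='k'. The data values are made up.

```python
import matplotlib.pyplot as plt
from matplotlib.collections import PathCollection
from matplotlib.lines import Line2D

fig, ax = plt.subplots()

# ax.plot creates a Line2D, which understands the marker* keywords.
(line,) = ax.plot([0, 1, 2], [0, 1, 0], color='k', marker='o', markersize=9,
                  markerfacecolor='0.75', markeredgecolor='k')

# ax.scatter creates a PathCollection, which uses collection keywords instead.
points = ax.scatter([0, 1, 2], [1, 0, 1], s=81, marker='o',
                    facecolor='0.75', edgecolor='k', linewidth=1.5)

print(type(line).__name__, type(points).__name__)
# Call plt.show() to display the figure.
```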
You can always flip the order around: start with a Matplotlib figure/axes that you create yourself, and pass that axes to Geopandas when you want to plot something. That might make it easier or more intuitive when you (also) want to plot other things in the same axes. It does perhaps require a bit more discipline to make sure the (spatial) coordinates etc. match.
I personally almost always do that, because it lets me do most of the plotting using the same Matplotlib API, which admittedly has a slightly steeper learning curve. But overall I find it easier than having to deal with the slightly different interpretation of every package that uses Matplotlib under the hood (e.g. geopandas, seaborn, xarray). But that really depends on where you're coming from.
Thank you for your detailed answer. Based on this I came up with this simplified code from my real project.
I have a shapefile shp and some point data df which I want to plot. shp is plotted with geopandas, df with matplotlib.pyplot. There is no need to convert the point data into a GeoDataFrame gdf as I did initially.
# read marker data (places with coordinates)
df = pd.read_csv("../obese_pct_by_place.csv")
gdf = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df['sweref99_lng'], df['sweref99_lat']))
# read shapefile
shp = gpd.read_file("../../SWEREF_Shapefiles/KommunSweref99TM/Kommun_Sweref99TM_region.shp")
fig, ax = plt.subplots(figsize=(10, 8))
ax.set_aspect('equal')
shp.plot(ax=ax)
# plot obesity markers
# geopandas, no edgecolor here
# gdf.plot(ax=ax, marker='o', c='r', markersize=gdf['obese'] * 25)
# matplotlib.pyplot with edgecolor
plt.scatter(df['sweref99_lng'], df['sweref99_lat'], c='r', edgecolor='k', s=df['obese'] * 25)
plt.show()

define size of individual subplots side by side

I am using subplots side by side
plt.subplot(1, 2, 1)
# plot 1
plt.xlabel('MEM SET')
plt.ylabel('Memory Used')
plt.bar(inst_memory['MEMORY_SET_TYPE'], inst_memory['USED_MB'], alpha = 0.5, color = 'r')
# plot 2
plt.subplot(1, 2, 2)
plt.xlabel('MEM POOL')
plt.ylabel('Memory Used')
plt.bar(set_memory['POOL_TYPE'], set_memory['MEMORY_POOL_USED'], alpha = 0.5, color = 'g')
They have identical sizes. Is it possible to define the width of each subplot, so that the right one can be wider, since it has more entries, and its text is not squeezed? Alternatively, would it be possible to replace the x-axis labels with numbers and add a legend explaining them (1: means xx, 2: means yyy)?
I find GridSpec helpful for subplot arrangements, see this demo at matplotlib.
import matplotlib.pyplot as plt
from matplotlib.gridspec import GridSpec
import numpy as np
import pandas as pd
N=24
inst_memory = pd.DataFrame({'MEMORY_SET_TYPE': np.random.randint(0,3,N),
'USED_MB': np.random.randint(0,1000,N)})
set_memory = pd.DataFrame({'MEMORY_POOL_USED': np.random.randint(0,1000,N),
'POOL_TYPE': np.random.randint(0,10,N)})
fig = plt.figure()
gs = GridSpec(1, 2, width_ratios=[1, 2],wspace=0.3)
ax1 = fig.add_subplot(gs[0])
ax2 = fig.add_subplot(gs[1])
ax1.bar(inst_memory['MEMORY_SET_TYPE'], inst_memory['USED_MB'], alpha = 0.5, color = 'r')
ax2.bar(set_memory['POOL_TYPE'], set_memory['MEMORY_POOL_USED'], alpha = 0.5, color = 'g')
You may need to adjust width_ratios and wspace to get the desired layout.
Rotating the x-axis tick labels might also help.
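Both suggestions can be combined in one short sketch. It passes the same width_ratios and wspace through plt.subplots via gridspec_kw instead of building a GridSpec by hand, and rotates the tick labels with tick_params. The category names and values are hypothetical stand-ins for the question's data.

```python
import matplotlib.pyplot as plt

# Hypothetical stand-ins for the question's DataFrames.
set_types = ['A', 'B', 'C']
set_used = [120, 560, 300]
pool_types = ['pool_%02d' % i for i in range(10)]
pool_used = [50, 200, 410, 90, 700, 330, 150, 620, 480, 260]

# The right subplot gets twice the width of the left one.
fig, (ax1, ax2) = plt.subplots(
    1, 2, figsize=(10, 4),
    gridspec_kw={'width_ratios': [1, 2], 'wspace': 0.3})

ax1.bar(set_types, set_used, alpha=0.5, color='r')
ax1.set_xlabel('MEM SET')
ax1.set_ylabel('Memory Used')

ax2.bar(pool_types, pool_used, alpha=0.5, color='g')
ax2.set_xlabel('MEM POOL')
ax2.set_ylabel('Memory Used')
# Rotate the long category labels so they do not overlap.
ax2.tick_params(axis='x', labelrotation=45)
# Call plt.show() to display the figure.
```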

Format the legend-title in a matplotlib ax.twiny() plot

Believe it or not, I need help with formatting the title of the legend (not the title of the plot) in a simple plot. I am plotting two series of data (X1 and X2) against Y in a twiny() plot.
I call matplotlib.lines to construct lines for the legend and then call plt.legend to construct the legend, passing text strings to name/explain the lines, formatting that text, and placing the legend. I could also pass a title string to plt.legend, but I cannot format it.
The closest I have come to a solution is to create another 'artist' for the title using .legend().set_title() and then format the title text. I assign it to a variable and pass that variable to the plt.legend call mentioned above. This does not result in an error, nor does it produce the desired effect. I have no control over the placement of the title.
I have read through a number of S-O postings and answers on legend-related issues, looked at the MPL docs and various tutorial-type web pages, and even taken a peek at a GitHub issue (#10391). Presumably the answer to my question is somewhere in there, but not in a format that I have been able to implement successfully.
#Imports
import matplotlib.pyplot as plt
import matplotlib.lines as mlines
import numpy as np
import seaborn as sns
plt.style.use('seaborn')
#Some made up data
y = np.arange(0, 1200, 100)
x1 = (np.log(y+1))
x2 = (2.2*x1)
#Plot figure
fig = plt.figure(figsize = (12, 14))
fig = plt.figure()
ax1 = fig.add_subplot(111)
ax2 = ax1.twiny()
sy1, sy2 = 'b-', 'r-'
tp, bm = 0, 1100
red_ticks = np.arange(0, 11, 2)
ax1.plot(x1, y, sy1)
ax1.set_ylim(tp, bm)
ax1.set_xlim(0, 10)
ax1.set_ylabel('Distance (m)')
ax1.set_xlabel('Area')
ax1.set_xticks(red_ticks)
blue_ticks = np.arange(0, 22, 4)
ax2.plot(x2, y, sy2)
ax2.set_xlim(0, 20)
ax2.set_xlabel('Volume')
ax2.set_xticks(blue_ticks)
ax2.grid(False)
x1_line = mlines.Line2D([], [], color='blue')
x2_line = mlines.Line2D([], [], color='red')
leg = ax1.legend().set_title('Format Legend Title ?',
prop = {'size': 'large',
'family':'serif',
'style':'italic'})
plt.legend([x1_line, x2_line], ['Blue Model', 'Red Model'],
title = leg,
prop ={'size':12,
'family':'serif',
'style':'italic'},
bbox_to_anchor = (.32, .92))
So what I want is a simple way to control the formatting of both the legend-title and legend-text in a single artist, and also have control over the placement of said legend.
The above code produces the warning "No handles with labels found to put in legend."
You need one single legend. You can set the title of that legend (not some other legend); then style it to your liking.
leg = ax2.legend([x1_line, x2_line], ['Blue Model', 'Red Model'],
prop ={'size':12, 'family':'serif', 'style':'italic'},
bbox_to_anchor = (.32, .92))
leg.set_title('Format Legend Title ?', prop = {'size': 24, 'family':'sans-serif'})
Unrelated, but also important: Note that you have two figures in your code. You should remove one of them.
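For reference, here is a trimmed-down, self-contained version of the question's plot with the single-legend approach applied and only one figure created. It is a sketch rather than a drop-in replacement: the axis limits, ticks, and style settings from the question are omitted, and the loc and bbox_to_anchor values are my own choice.

```python
import matplotlib.lines as mlines
import matplotlib.pyplot as plt
import numpy as np

# Made-up data, as in the question.
y = np.arange(0, 1200, 100)
x1 = np.log(y + 1)
x2 = 2.2 * x1

# One figure only, with a second x-axis sharing the same y-axis.
fig, ax1 = plt.subplots()
ax2 = ax1.twiny()
ax1.plot(x1, y, 'b-')
ax2.plot(x2, y, 'r-')

# Proxy artists that stand in for the two plotted lines in the legend.
x1_line = mlines.Line2D([], [], color='blue')
x2_line = mlines.Line2D([], [], color='red')

# A single legend: prop styles the entries, set_title styles the title.
leg = ax2.legend([x1_line, x2_line], ['Blue Model', 'Red Model'],
                 prop={'size': 12, 'family': 'serif', 'style': 'italic'},
                 loc='upper left', bbox_to_anchor=(0.02, 0.98))
leg.set_title('Format Legend Title ?',
              prop={'size': 24, 'family': 'sans-serif'})
# Call plt.show() to display the figure.
```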

matplotlib line plot: don't show vertical lines in step function

I have a plot that consists only of horizontal lines at certain values when I have a signal, and nothing otherwise. So I am looking for a way to plot this without the vertical lines. There may be gaps between the lines when there is no signal, and I don't want the lines to connect, nor do I want a line dropping to 0. Is there a way to plot this in matplotlib?
self.figure = plt.figure()
self.canvas = FigureCanvas(self.figure)
axes = self.figure.add_subplot(111)
axes.plot(df.index, df["x1"], lw=1.0, c=self.getColour('g', i), ls=ls)
The plot you are looking for is Matplotlib's plt.hlines(y, xmin, xmax).
For example:
import matplotlib.pyplot as plt
y = range(1, 11)
xmin = range(10)
xmax = range(1, 11)
colors=['blue', 'green', 'red', 'yellow', 'orange', 'purple',
'cyan', 'magenta', 'pink', 'black']
fig, ax = plt.subplots(1, 1)
ax.hlines(y, xmin, xmax, colors=colors)
plt.show()
Yields a plot like this:
See the Matplotlib documentation for more details.
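To get from a sampled signal to the y, xmin, xmax arguments, the samples first have to be grouped into runs of constant value. The helper below is a hypothetical sketch of that step: constant_runs is not a Matplotlib function, and it assumes that missing signal is marked with NaN (or None) in the data.

```python
import math
import matplotlib.pyplot as plt


def constant_runs(x, y):
    """Return (level, x_start, x_end) for each run of equal, non-missing y values."""
    runs = []
    start = None  # index where the current run began, or None inside a gap
    for i, value in enumerate(y):
        is_gap = value is None or math.isnan(value)
        # Close the current run when a gap or a different level is reached.
        if start is not None and (is_gap or value != y[start]):
            runs.append((y[start], x[start], x[i - 1]))
            start = None
        # Open a new run on the first valid sample after a gap or level change.
        if start is None and not is_gap:
            start = i
    if start is not None:
        runs.append((y[start], x[start], x[len(y) - 1]))
    return runs


nan = float('nan')
x = list(range(10))
y = [1, 1, 1, nan, nan, 3, 3, nan, 2, 2]
runs = constant_runs(x, y)

# Unpack the runs into the three sequences that hlines expects.
levels, xmin, xmax = zip(*runs)
fig, ax = plt.subplots()
ax.hlines(levels, xmin, xmax, colors='g', linewidth=1.0)
# Call plt.show() to display the figure.
```

Each run spans from its first to its last sample, so a level that occurs in only one sample yields a zero-length line. Whether a run should instead extend to the next sample depends on how the signal is meant to be read.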

"panel barchart" in matplotlib

I would like to produce a figure like this one using matplotlib:
(source: peltiertech.com)
My data are in a pandas DataFrame, and I've gotten as far as a regular stacked barchart, but I can't figure out how to do the part where each category is given its own y-axis baseline.
Ideally I would like the vertical scale to be exactly the same for all the subplots and move the panel labels off to the side so there can be no gaps between the rows.
I haven't exactly replicated what you want but this should get you pretty close.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
#create dummy data
cols = ['col'+str(i) for i in range(10)]
ind = ['ind'+str(i) for i in range(10)]
df = pd.DataFrame(np.random.normal(loc=10, scale=5, size=(10, 10)), index=ind, columns=cols)
#create plot
sns.set_style("whitegrid")
axs = df.plot(kind='bar', subplots=True, sharey=True,
figsize=(6, 5), legend=False, yticks=[],
grid=False, ylim=(0, 14), edgecolor='none',
fontsize=14, color=[sns.xkcd_rgb["brownish red"]])
plt.text(-1, 100, "The y-axis label", fontsize=14, rotation=90) # add a y-label with custom positioning
sns.despine(left=True) # get rid of the axes
for ax in axs: # set the names beside the axes
    ax.lines[0].set_visible(False) # remove ugly dashed line
    ax.set_title('')
    sername = ax.get_legend_handles_labels()[1][0]
    ax.text(9.8, 5, sername, fontsize=14)
plt.suptitle("My panel chart", fontsize=18)
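The two wishes from the question (an identical vertical scale in every panel, and no gaps between the rows) can also be sketched in plain Matplotlib. This is a different technique from the pandas df.plot(subplots=True) call above: it uses plt.subplots with sharey=True and hspace=0, and puts each series name in a horizontal y-label at the side. The data are invented for illustration.

```python
import matplotlib.pyplot as plt

# Invented data: one list of values per panel.
categories = ['q1', 'q2', 'q3', 'q4']
data = {'A': [3, 5, 2, 4], 'B': [1, 4, 6, 2], 'C': [5, 2, 3, 3]}

# One row per series; sharey keeps the vertical scale identical.
fig, axs = plt.subplots(len(data), 1, sharex=True, sharey=True,
                        figsize=(6, 5))
fig.subplots_adjust(hspace=0)  # no gap between the rows

for ax, (name, values) in zip(axs, data.items()):
    ax.bar(categories, values, color='firebrick', edgecolor='none')
    ax.set_ylim(0, 7)
    ax.set_yticks([])
    # Use a horizontal y-label as the panel name, placed at the side.
    ax.set_ylabel(name, rotation=0, ha='right', va='center')

fig.suptitle('My panel chart', fontsize=18)
# Call plt.show() to display the figure.
```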