Polishing Scatter Plot Legends With Seaborn - pandas

The scatter plot generated for the piece of code below using seaborn is as follows.
ax = sns.scatterplot(x="Param_1",
                     y="Param_2",
                     hue="Process", style='Item', data=df,
                     s=30, legend='full')
I want the color legend entries for Process not to be drawn as circles, because circles also denote the data for Item 'One'. What is the best way to present the color legend for Process without conflicting with the marker shapes used for Item?

You can create so-called proxy artists and use them as legend symbols.
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
fig,(ax1,ax2) = plt.subplots(ncols=2)
tips = sns.load_dataset("tips")
hue = "day"
style = "time"
sns.scatterplot(x="total_bill", y="tip", hue=hue, style=style, data=tips, ax=ax1)
ax1.set_title("Default Legend")
sns.scatterplot(x="total_bill", y="tip", hue=hue, style=style, data=tips, ax=ax2)
ax2.set_title("Custom Legend")
handles, labels = ax2.get_legend_handles_labels()
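for i,label in enumerate(labels):
    if label == hue:
        continue  # skip the subtitle entry for the hue variable
    if label == style:
        break  # stop at the style entries so the marker shapes stay untouched
    # swap the colored circle for a colored rectangle (proxy artist)
    handles[i] = mpatches.Patch(color=handles[i].get_fc()[0])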
ax2.legend(handles, labels)
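The same proxy-artist idea can be sketched with plain matplotlib, without relying on the handles seaborn generates. This is only an illustration under stated assumptions: the points dictionary, the process names "A"/"B" and the item names "One"/"Two" are made up, and the Agg backend is selected just so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from matplotlib.lines import Line2D

# Hypothetical data: (process, item) -> list of (x, y) points
points = {
    ("A", "One"): [(1, 2), (2, 3)],
    ("A", "Two"): [(1.5, 1), (2.5, 2)],
    ("B", "One"): [(3, 1), (4, 2)],
    ("B", "Two"): [(3.5, 3), (4.5, 4)],
}
colors = {"A": "tab:blue", "B": "tab:orange"}  # color encodes the process
markers = {"One": "o", "Two": "X"}             # marker shape encodes the item

fig, ax = plt.subplots()
for (process, item), pts in points.items():
    xs, ys = zip(*pts)
    ax.scatter(xs, ys, color=colors[process], marker=markers[item], s=30)

# Proxy artists: colored rectangles for the processes,
# neutral grey markers for the items
color_handles = [mpatches.Patch(color=c, label=p) for p, c in colors.items()]
marker_handles = [
    Line2D([], [], linestyle="none", marker=m, color="grey", label=i)
    for i, m in markers.items()
]
legend = ax.legend(handles=color_handles + marker_handles)
legend_labels = [t.get_text() for t in legend.get_texts()]
```

Because the color entries are rectangles and the shape entries are grey, neither group can be mistaken for the other.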

Related

Plot multiple mplfinance plots sharing x axis

I am trying to plot 5 charts one under the other with mplfinance.
This works:
for coin in coins:
    mpf.plot(df_coins[coin], title=coin, type='line', volume=True, show_nontrading=True)
However each plot is a separate image in my Python Notebook cell output. And the x-axis labelling is repeated for each image.
I try to make a single figure containing multiple subplot/axis, and plot one chart into each axis:
from matplotlib import pyplot as plt
N = len(df_coins)
fig, axes = plt.subplots(N, figsize=(20, 5*N), sharex=True)
for i, ((coin, df), ax) in zip(enumerate(df_coins.items()), axes):
    mpf.plot(df, ax=ax, title=coin, type='line', volume=True, show_nontrading=True)
This displays subfigures of the correct dimensions, however they are not getting populated with data. Axes are labelled from 0.0 to 1.0 and the title is not appearing.
What am I missing?
There are two ways to create the subplots. One is to set up the figure and its axes through mplfinance. The other is to keep the matplotlib subplots you already use and pass each axes to mpf.plot().
yfinance data
import matplotlib.pyplot as plt
import mplfinance as mpf
import yfinance as yf
tickers = ['AAPL','GOOG','TSLA']
data = yf.download(tickers, start="2021-01-01", end="2021-03-01", group_by='ticker')
aapl = data[('AAPL',)]
goog = data[('GOOG',)]
tsla = data[('TSLA',)]
mplfinance
fig = mpf.figure(style='yahoo', figsize=(12,9))
#fig.subplots_adjust(hspace=0.3)
ax3 = fig.add_subplot(3,1,3)  # create the bottom axes first so the others can share its x-axis
ax1 = fig.add_subplot(3,1,1, sharex=ax3)
ax2 = fig.add_subplot(3,1,2, sharex=ax3)
mpf.plot(aapl, type='line', ax=ax1, axtitle='AAPL', xrotation=0)
mpf.plot(goog, type='line', ax=ax2, axtitle='GOOG', xrotation=0)
mpf.plot(tsla, type='line', ax=ax3, axtitle='TSLA', xrotation=0)
ax1.set_xticklabels([])
ax2.set_xticklabels([])
matplotlib
N = len(tickers)
fig, axes = plt.subplots(N, figsize=(20, 5*N), sharex=True)
for df,t,ax in zip([aapl,goog,tsla], tickers, axes):
    mpf.plot(df, ax=ax, axtitle=t, type='line', show_nontrading=True)# volume=True
In addition to the techniques mentioned by @r-beginners, there is another technique that may work for you when all plots share the same x-axis: use mpf.make_addplot().
aps = []
for coin in coins[1:]:
    aps.append(mpf.make_addplot(df_coins[coin]['Close'], title=coin, type='line'))
coin = coins[0]
mpf.plot(df_coins[coin],axtitle=coin,type='line',volume=True,show_nontrading=True,addplot=aps)
If you choose to do type='candle' instead of 'line', then change
df_coins[coin]['Close']
to simply
df_coins[coin]

How to bold the pandas boxplot

This is my boxplot. When I put it in the paper, the boxplot lines look very thin. I tried the whisker settings, but they only thicken some of the lines. Do you know how to make all the lines in the boxplot bold? Thank you.
I don't know which graph library you are using, but matplotlib lets you set the line properties through boxprops and the related arguments (whiskerprops, capprops). See the official reference.
import numpy as np
import matplotlib.pyplot as plt
# fake data
np.random.seed(100)
data = np.random.lognormal(size=(37, 4), mean=1.5, sigma=1.75)
labels = list('ABCD')
fs = 10 # fontsize
boxprops = dict(linestyle='-', linewidth=3, color='k')
whiskerprops = dict(linestyle='-', linewidth=3, color='k')
capprops = dict(linestyle='-', linewidth=3, color='k')
fig, ax = plt.subplots(1,1, figsize=(6, 6))
ax.boxplot(data, boxprops=boxprops, whiskerprops=whiskerprops, capprops=capprops)
ax.set_title('Custom boxprops', fontsize=fs)
plt.show()
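The example above leaves the median line at its default width. A minimal sketch, assuming you want every line group thickened equally, reuses one property dictionary and also passes it as medianprops. The variable names (thick, bp, widths) and the red median color are illustrative choices, not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# fake data, same shape as in the example above
rng = np.random.default_rng(100)
data = rng.lognormal(mean=1.5, sigma=1.75, size=(37, 4))

# one shared style for boxes, whiskers and caps
thick = dict(linestyle='-', linewidth=3, color='k')
# same width for the median, in a different color so it stays visible
thick_median = dict(linestyle='-', linewidth=3, color='r')

fig, ax = plt.subplots(figsize=(6, 6))
bp = ax.boxplot(data, boxprops=thick, whiskerprops=thick,
                capprops=thick, medianprops=thick_median)

# boxplot() returns a dict of artist lists; collect the widths actually applied
widths = {
    name: {artist.get_linewidth() for artist in bp[name]}
    for name in ('boxes', 'whiskers', 'caps', 'medians')
}
```

Inspecting the returned dictionary is a quick way to confirm that no line group was left at the default width before exporting the figure.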

how to customize color legend when using for loop in matplotlib, scatter

I want to draw a 3D scatter, in which the data is colored by group. Here is the data sample:
aa=pd.DataFrame({'a':[1,2,3,4,5],
'b':[2,3,4,5,6],
'c':[1,3,4,6,9],
'd':[0,0,1,2,3],
'e':['abc','sdf','ert','hgf','nhkm']})
Here, a, b, c are axis x, y, z. e is the text shown in the scatter. I need d to group the data and show different colors.
Here is my code:
fig = plt.figure()
ax = fig.gca(projection='3d')
zdirs = aa.loc[:,'e'].__array__()
xs = aa.loc[:,'a'].__array__()
ys = aa.loc[:,'b'].__array__()
zs = aa.loc[:,'c'].__array__()
colors = aa.loc[:,'d'].__array__()
colors1=np.where(colors==0,'grey',
np.where(colors==1,'yellow',
np.where(colors==2,'green',
np.where(colors==3,'pink','red'))))
for i in range(len(zdirs)): #plot each point + its index as text above
    ax.scatter(xs[i],ys[i],zs[i],color=colors1[i])
    ax.text(xs[i],ys[i],zs[i], '%s' % (str(zdirs[i])), size=10, zorder=1, color='k')
ax.set_xlabel('a')
ax.set_ylabel('b')
ax.set_zlabel('c')
plt.show()
But I do not know how to put a legend on the plot. I hope my legend is like:
The colors and the numbers should match and be ordered.
Could anyone help me with how to customize the color bar?
First of all, I've taken the liberty of reducing your code a bit:
I'd suggest creating a ListedColormap to map integer -> color, which allows you to pass the color column via c=aa['d'] (note it's c=, not color=!).
You don't need __array__() here; in the code below you can use aa['a'] directly.
Finally, you can add an empty scatter plot for each color in the ListedColormap, which ax.legend() can then render correctly.
import pandas as pd
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import numpy as np
from matplotlib.colors import ListedColormap
import matplotlib.patches as mpatches
aa=pd.DataFrame({'a':[1,2,3,4,5],
'b':[2,3,4,5,6],
'c':[1,3,4,6,9],
'd':[0,0,1,2,3],
'e':['abc','sdf','ert','hgf','nhkm']})
fig = plt.figure()
ax = fig.add_subplot(projection='3d')  # fig.gca(projection=...) was removed in newer matplotlib
cmap = ListedColormap(['grey', 'yellow', 'green', 'pink','red'])
# vmin/vmax pin integer i to the i-th color of the colormap
ax.scatter(aa['a'],aa['b'],aa['c'],c=aa['d'],cmap=cmap,vmin=0,vmax=cmap.N-1)
for x,y,z,label in zip(aa['a'],aa['b'],aa['c'],aa['e']):
    ax.text(x,y,z,label,size=10,zorder=1)
# Create a legend through an *empty* scatter plot for each color
for i in range(cmap.N):
    ax.scatter([], [], color=cmap(i), label=str(i))
ax.legend()
ax.set_xlabel('a')
ax.set_ylabel('b')
ax.set_zlabel('c')
plt.show()
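As an alternative to the empty scatter plots, matplotlib can build the legend entries from the scatter itself through PathCollection.legend_elements(). The sketch below is a different technique from the answer above and assumes a matplotlib version that provides legend_elements() (3.1 or later). It only lists the groups that actually occur in the data, so the unused fifth color does not appear.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import ListedColormap
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401, registers the 3d projection on older versions

# same sample values as the DataFrame in the question
xs = np.array([1, 2, 3, 4, 5])
ys = np.array([2, 3, 4, 5, 6])
zs = np.array([1, 3, 4, 6, 9])
groups = np.array([0, 0, 1, 2, 3])

cmap = ListedColormap(['grey', 'yellow', 'green', 'pink', 'red'])

fig = plt.figure()
ax = fig.add_subplot(projection='3d')
# vmin/vmax make group i use the i-th color of the colormap
sc = ax.scatter(xs, ys, zs, c=groups, cmap=cmap, vmin=0, vmax=cmap.N - 1)

# one handle per unique value in `groups`, in ascending order
handles, labels = sc.legend_elements()
ax.legend(handles, labels, title='d')
```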

matplotlib: shorten a colorbar by half when the colorbar is created using axes_grid1

I am trying to shorten a colorbar by half. Does anyone know how to do this? I tried cax.get_position() and then cax.set_position(), but this method did not work.
Besides, it seems that axes created by axes_grid1 has the same bbox positions as the original axes. Is this a bug?
PS. I have to use axes_grid1 to create colorbar axes, because I need to use tight_layout() afterwards, and tight_layout() only applies to axes created by axes_grid1 but not ones created by add_axes().
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1 import make_axes_locatable
import numpy as np
plt.figure()
ax = plt.gca()
im = ax.imshow(np.arange(100).reshape((10,10)))
divider = make_axes_locatable(ax)
cax = divider.append_axes("right", size="5%", pad=0.05)
bbox1 = ax.get_position()
print(bbox1)
bbox1 = cax.get_position()
print(bbox1)
plt.colorbar(im, cax=cax)
plt.show()
The whole point of the axes_divider is to divide the axes to make space for a new axes. This ensures that all axes have the same surrounding box. And that is the box you see being printed.
Some of the usual ways to create a colorbar at a certain location in the figure are shown in this question. Here the problem seems to be being able to call tight_layout. This is achievable with the following two options. (There might be others still.)
A. using gridspec
I'm not too sure about the exact requirements here, but it seems that using a normal grid layout would be more in the direction of what you need here.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
fig = plt.figure()
gs = gridspec.GridSpec(2, 2, width_ratios=[95,5],)
ax = fig.add_subplot(gs[:, 0])
im = ax.imshow(np.arange(100).reshape((10,10)))
cax = fig.add_subplot(gs[1, 1])
fig.colorbar(im, cax=cax, ax=ax)
plt.tight_layout()
plt.show()
B. Using axes_grid1
If you really need to use axes_grid1, it might become a little bit more complicated.
import matplotlib.pyplot as plt
import matplotlib.axes
from mpl_toolkits.axes_grid1 import make_axes_locatable, Size
import numpy as np
fig, ax = plt.subplots()
im = ax.imshow(np.arange(100).reshape((10,10)))
divider = make_axes_locatable(ax)
pad = 0.03
pad_size = Size.Fraction(pad, Size.AxesY(ax))
xsize = Size.Fraction(0.05, Size.AxesX(ax))
ysize = Size.Fraction(0.5-pad/2., Size.AxesY(ax))
divider.set_horizontal([Size.AxesX(ax), pad_size, xsize])
divider.set_vertical([ysize, pad_size, ysize])
ax.set_axes_locator(divider.new_locator(0, 0, ny1=-1))
cax = matplotlib.axes.Axes(ax.get_figure(),
ax.get_position(original=True))
locator = divider.new_locator(nx=2, ny=0)
cax.set_axes_locator(locator)
fig.add_axes(cax)
fig.colorbar(im, cax=cax)
plt.tight_layout()
plt.show()
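If axes_grid1 is not a hard requirement, a third possibility (not one of the two options above) is to let fig.colorbar() create the colorbar axes itself and shorten it with the shrink argument. This is a minimal sketch. The anchor and pad values are illustrative, and whether the resulting layout is close enough to the axes_grid1 version is something to check on your own figure.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
im = ax.imshow(np.arange(100).reshape((10, 10)))

# shrink=0.5 halves the colorbar length relative to its default;
# anchor=(0.0, 0.0) places the shortened bar at the bottom instead of the center
cbar = fig.colorbar(im, ax=ax, shrink=0.5, anchor=(0.0, 0.0), pad=0.03)

# the colorbar axes was created next to a regular subplot, so tight_layout can be called
fig.tight_layout()
```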

Remove the legend on a matplotlib figure

To add a legend to a matplotlib plot, one simply runs legend().
How to remove a legend from a plot?
(The closest I came to this is to run legend([]) in order to empty the legend from data. But that leaves an ugly white rectangle in the upper right corner.)
As of matplotlib v1.4.0rc4, a remove method has been added to the legend object.
Usage:
ax.get_legend().remove()
or
legend = ax.legend(...)
...
legend.remove()
See here for the commit where this was introduced.
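A small self-contained sketch of the remove() approach, using made-up line data, shows that the axes no longer reports a legend afterwards. The variable names had_legend and legend_after exist only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4], label='line 1')

legend = ax.legend()                     # create the legend
had_legend = ax.get_legend() is legend   # the axes now holds this legend

legend.remove()                          # remove it again
legend_after = ax.get_legend()           # None once the legend is gone
```

Since get_legend() returns None when there is no legend, code that removes legends in a loop may want to check for None before calling remove().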
If you want to plot a Pandas dataframe and want to remove the legend, add legend=None as parameter to the plot command.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df2 = pd.DataFrame(np.random.randn(10, 5))
df2.plot(legend=None)
plt.show()
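To see both pandas routes side by side, the following sketch suppresses the legend at plot time and, separately, removes it afterwards. It uses legend=False, the boolean form of the same parameter. The column names a, b, c are made up.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((10, 3)), columns=list('abc'))

# Route 1: never create the legend
ax_no_legend = df.plot(legend=False)

# Route 2: let pandas create it, then remove it from the returned Axes
ax_removed = df.plot()
created = ax_removed.get_legend() is not None
ax_removed.get_legend().remove()
```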
You could use the legend's set_visible method:
ax.legend().set_visible(False)
draw()
This is based on an answer provided to me in response to a similar question I had some time ago here
(Thanks for that answer Jouni - I'm sorry I was unable to mark the question as answered... perhaps someone who has the authority can do so for me?)
If you import pyplot as plt, you can pass an empty string as the legend labels, which tells matplotlib that no entries should appear in the legend, and set frameon=False to remove the border around it:
import matplotlib.pyplot as plt
plt.legend('',frameon=False)
You have to add the following lines of code:
ax = gca()
ax.legend_ = None
draw()
gca() returns the current axes handle, which has the legend_ property.
Following the information from @naitsirhc, I looked for the official API documentation. Here are my findings and some sample code.
I created a matplotlib.axes.Axes object with seaborn.scatterplot().
ax.get_legend() returns a matplotlib.legend.Legend instance.
Finally, call its .remove() method to remove the legend from your plot.
ax = sns.scatterplot(......)
_lg = ax.get_legend()
_lg.remove()
If you check the matplotlib.legend.Legend API documentation, you won't see a .remove() method.
The reason is that matplotlib.legend.Legend inherits from matplotlib.artist.Artist. Therefore, calling ax.get_legend().remove() basically calls matplotlib.artist.Artist.remove().
In the end, you can even simplify the code to two lines.
ax = sns.scatterplot(......)
ax.get_legend().remove()
If you are not using fig and ax plot objects you can do it like so:
import matplotlib.pyplot as plt
# do plot specifics
plt.legend('')
plt.show()
Here is a more complex example of legend removal and manipulation with matplotlib and seaborn dealing with subplots:
From seaborn, get the Axes object returned by sns.<some_plot>() and call ax.get_legend().remove(), as indicated by @naitsirhc. The following example also shows how to move the legend to the side and how to handle subplots.
# imports
import seaborn as sns
import matplotlib.pyplot as plt
# get data
sns.set()
sns.set_theme(style="darkgrid")
tips = sns.load_dataset("tips")
# subplots
fig, axes = plt.subplots(1, 2, sharex=True, sharey=True, figsize=(12,6))
fig.suptitle('Example of legend manipulations on subplots with seaborn')
g0 = sns.pointplot(ax=axes[0], data=tips, x="day", y="total_bill", hue="size")
g0.set(title="Pointplot with no legend")
g0.get_legend().remove() # <<< REMOVE LEGEND HERE
g1 = sns.swarmplot(ax=axes[1], data=tips, x="day", y="total_bill", hue="size")
g1.set(title="Swarmplot with legend aside")
# change legend position: https://www.statology.org/seaborn-legend-position/
g1.legend(bbox_to_anchor=(1.02, 1), loc='upper left', borderaxespad=0)
I made a legend by adding it to the figure, not to an axis (matplotlib 2.2.2). To remove it, I set the legends attribute of the figure to an empty list:
import matplotlib.pyplot as plt
fig = plt.figure()
ax1 = fig.add_subplot(111)
ax2 = ax1.twinx()
ax1.plot(range(10), range(10, 20), label='line 1')
ax2.plot(range(10), range(30, 20, -1), label='line 2')
fig.legend()
fig.legends = []
plt.show()
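In current matplotlib versions the figure-level legend can also be removed through the Legend object that fig.legend() returns, which avoids overwriting the legends list by hand. A minimal sketch, reusing the two-axes setup from above, with n_before and n_after as illustrative names:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig = plt.figure()
ax1 = fig.add_subplot(111)
ax2 = ax1.twinx()
ax1.plot(range(10), range(10, 20), label='line 1')
ax2.plot(range(10), range(30, 20, -1), label='line 2')

fig_legend = fig.legend()      # figure-level legend collecting both lines
n_before = len(fig.legends)

fig_legend.remove()            # detaches the legend from the figure
n_after = len(fig.legends)
```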
If you are using seaborn, you can use the legend parameter, even if you are plotting more than once in the same figure. Example with some df:
import seaborn as sns
# Will display legend
ax1 = sns.lineplot(x='cars', y='miles', hue='brand', data=df)
# No legend displayed
ax2 = sns.lineplot(x='cars', y='miles', hue='brand', data=df, legend=False)
you could simply do:
axs[n].legend(loc='upper left',ncol=2,labelspacing=0.01)
for i in [4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19]:
    axs[i].legend([])