I am trying to create a seaborn boxplot and overlay individual data points using a seaborn swarmplot, for a dataset that has two categorical variables (Nameplate Capacity and Scenario) and one continuous variable (ELCC values). Because I am overlaying two seaborn plots, I get two legends for the same variables. How do I draw a box plot together with a swarm plot while showing only the legend from the box plot? My current code looks like:
plt.subplots(figsize=(25,18))
sns.set_theme(style = "whitegrid", font_scale= 1.5 )
ax = sns.boxplot(x="Scenario", y="ELCC", hue = "Nameplate Capacity",
data=final_offshore, palette = "Pastel1")
ax = sns.swarmplot(x="Scenario", y="ELCC", hue = "Nameplate Capacity", dodge=True, marker='D', size =9, alpha=0.35, data=final_offshore, color="black")
plt.xlabel('Scenarios')
plt.ylabel('ELCC values')
plt.title('Contribution of ad-hoc offshore generator in each scenario')
My plot so far:
You can draw your box plot, get that legend, draw the swarm plot and then re-draw the legend:
# Draw the box plot
ax = sns.boxplot(
x="Scenario",
y="ELCC",
hue="Nameplate Capacity",
data=final_offshore,
palette="Pastel1",
)
# Get the legend from just the box plot
handles, labels = ax.get_legend_handles_labels()
# Draw the swarmplot
sns.swarmplot(
x="Scenario",
y="ELCC",
hue="Nameplate Capacity",
dodge=True,
marker="D",
size=9,
alpha=0.35,
data=final_offshore,
color="black",
ax=ax,
)
# Remove the old legend
ax.legend_.remove()
# Add just the handles/labels from the box plot back
ax.legend(
handles,
labels,
loc=0,
)
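The same capture-then-redraw pattern works for any two layers drawn on one Axes. Below is a minimal sketch in plain Matplotlib, so it runs without seaborn or the final_offshore data. The bars stand in for the box plot and the points for the swarm plot; the values and the "100 MW" / "200 MW" labels are made up for illustration.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# Layer 1: labeled bars stand in for the box plot (values are made up)
ax.bar([0, 1], [3, 5], width=0.4, label="100 MW")
ax.bar([0.4, 1.4], [4, 6], width=0.4, label="200 MW")

# Capture the legend entries before the second layer exists
handles, labels = ax.get_legend_handles_labels()

# Layer 2: labeled points stand in for the swarm plot
ax.scatter([0, 1], [3.2, 4.8], color="black", label="100 MW")
ax.scatter([0.4, 1.4], [4.1, 5.7], color="black", label="200 MW")

# Build the legend from the captured entries only
legend = ax.legend(handles, labels, loc="best")
```

Without the captured handles, a plain ax.legend() call here would list all four labeled artists.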
I am using matplotlib to graph 3 time series. I want all 3 y-axes to be plotted on the right side of the graph. However, I am unable to get the tick values of one of the y-axes to plot on the right side; only its axis label appears there.
Here is my code:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
import datetime as dt
import matplotlib.dates as mdates
from matplotlib.ticker import ScalarFormatter, FormatStrFormatter
from matplotlib import ticker
# Set date range for graph
df_ratios=df_ratios.loc['2017-06':'2021-12']
#-------------------------------------------------------------------
#For all Matplotlib plots, we start by creating a figure and an axes.
#-------------------------------------------------------------------
# subplot for plotting figure
fig, ax = plt.subplots()
#Similar to fig=plt.figure(figsize=(12,6))
fig.set_figwidth(12)
fig.set_figheight(6)
# Graph title
#############
fig.suptitle('Inventory-to-Sales Ratios by Industry',fontsize=20)
# LABELS: Set x label which is common / set left y-axis label / set labelcolor and labelsize to the left Y-axis
###############################################################################################################
ax.set_xlabel('Monthly: Jun 2017 - Dec. 2021')
ax.set_ylabel('Manufacturers I/S Ratio', color='red',size='x-large')
ax.tick_params(axis='y', labelcolor='red', labelsize='large')
ax.spines["right"].set_visible(True)
# Left Y-Axis: Mfg IS ratio on left Y-axis
###########################################
ax.plot(df_ratios.index, df_ratios['mfg_is_ratio'], color='red',linewidth=5.0)
ax.set_ylim(1.25,1.8)
ax.yaxis.set_major_locator(ticker.MultipleLocator(0.10))
ax.yaxis.set_label_position('right')
ax.yaxis.set_ticks_position('right')
ax.spines["right"].set_position(("axes", 1.0)) # Keep the first y-axis spine at the right edge of the axes
# RIGHT Y-Axis labels: twinx sets the same x-axis for both plots / set right y-axis label / set labelcolor and labelsize to the right Y-axis
############################################################################################################################################
ax_1 = ax.twinx()
ax_1.set_ylabel('Wholesalers I/S Ratio', color='blue', size='x-large')
ax_1.tick_params(axis='y', labelcolor='blue',labelsize='large')
# FIRST Right Y-Axis plot: Wholesale IS ratio
#############################################
ax_1.plot(df_ratios.index, df_ratios['whole_is_ratio'], color='blue',linewidth=5.0)
ax_1.spines["right"].set_position(("axes", 1.08)) # Offset this y-axis spine 8% of the axes width beyond the right edge
ax_1.set_ylim(1.15,1.75)
ax_1.yaxis.set_major_locator(ticker.MultipleLocator(0.10))
# SECOND Right Y-Axis: Sum of Mfg+Wholesale ratios
##################################################
ax_2=ax.twinx()
ax_2.set_ylabel('Wholesalers Inventories', color='green', size='x-large')
ax_2.spines["right"].set_position(("axes", 1.18)) # Offset this y-axis spine 18% of the axes width beyond the right edge
ax_2.set_ylim(2.60,3.25)
ax_2.plot(df_ratios.index, df_ratios['totals'], color='green',linewidth=5.0)
ax_2.tick_params(axis='y', labelcolor='green',labelsize='large')
# Show graph:
#############
plt.show()
Here is the result:
How do I get the red y-axis values (manufacturers i/s ratio) to plot on the right side of the graph?
The cause is the twinx() call.
From its documentation:
Create a new Axes with an invisible x-axis and an independent y-axis
positioned opposite to the original one (i.e. at right).
As a side effect, twinx() also moves the ticks of the original axes back to the left, which undoes the earlier ax.yaxis.set_ticks_position('right').
To reverse this, call ax.yaxis.tick_right() after all twinx() calls, for example right before plt.show().
The red ticks should then be placed on the right side.
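A reduced sketch of that ordering, using made-up data in place of df_ratios. The spine offsets 1.08 and 1.18 are taken from the question; everything else is an illustrative assumption.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [1.3, 1.5, 1.7], color="red")

# Each twinx() call moves the original y-axis ticks back to the left
ax_1 = ax.twinx()
ax_1.plot([0, 1, 2], [1.2, 1.4, 1.6], color="blue")
ax_1.spines["right"].set_position(("axes", 1.08))

ax_2 = ax.twinx()
ax_2.plot([0, 1, 2], [2.6, 2.9, 3.2], color="green")
ax_2.spines["right"].set_position(("axes", 1.18))

# After the last twinx(), move the original axis to the right again
ax.yaxis.tick_right()
ax.yaxis.set_label_position("right")

# Leave room on the right for the offset spines
fig.subplots_adjust(right=0.75)
```

With all three axes sharing the right edge, the red ticks overlap the first twin unless one of them is offset as well.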
I have 3 different dataframes and they all have a 'price' column. I want to have 3 different price to index scatter plots, but I don't want to have them on 3 different cells. I want them to be just in one cell, next to one another.
You should be able to do that in a single cell in Jupyter notebook.
This is probably not the most elegant way to do it, but you'll just need to structure your code so that you draw out each plot in order. e.g. create subplot1, add ticks, labels, etc... plt.show() it, then do the same for all the subsequent plots.
For example:
from matplotlib import pyplot as plt
# First plot
ax = plt.subplot()
plt.scatter( ... ) # Scatter plot 1 data
plt.title( ... )
plt.show()
# Second plot
ax = plt.subplot()
plt.scatter( ... ) # Scatter plot 2 data
plt.title( ... )
plt.show()
# Third plot
# Rinse and repeat
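If the goal is to have the three plots next to one another rather than stacked, one alternative is a single figure with one row of three axes. This is a sketch under assumptions: the random price arrays below are made up and stand in for the 'price' columns of the three dataframes.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up price series standing in for the three DataFrames' 'price' columns
rng = np.random.default_rng(0)
prices = [rng.uniform(10, 100, size=20) for _ in range(3)]

# One row, three columns: the plots sit side by side in a single figure
fig, axes = plt.subplots(1, 3, figsize=(15, 4))
for i, (ax, price) in enumerate(zip(axes, prices), start=1):
    # With a DataFrame this would be ax.scatter(df.index, df["price"])
    ax.scatter(np.arange(len(price)), price)
    ax.set_title(f"Prices {i}")
    ax.set_xlabel("index")
    ax.set_ylabel("price")

fig.tight_layout()
```

In a notebook the figure is displayed once, at the end of the cell, so all three plots appear in one output.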
How would you go about changing the style of only some boxes in a matplotlib boxplot? Below, you can see an example of styling, but I would like the style to only apply to one of the boxes.
The same question has been asked for seaborn boxplots already. For matplotlib boxplots this is even easier, since the boxplot directly returns a dictionary of the involved artists, see boxplot documentation.
This means that if bplot = ax.boxplot(..) is your boxplot, you may access the boxes via bplot['boxes'], select one of them and set its linestyle to your desire. E.g.
bplot['boxes'][2].set_linestyle("-.")
Modifying the boxplot_color example
import matplotlib.pyplot as plt
import numpy as np
# Random test data
np.random.seed(19680801)
all_data = [np.random.normal(0, std, size=100) for std in range(1, 4)]
labels = ['x1', 'x2', 'x3']
fig, ax = plt.subplots()
# Rectangular box plot with filled boxes (patch_artist=True)
bplot = ax.boxplot(all_data, vert=True, patch_artist=True, labels=labels)
# Loop through boxes and colorize them individually
colors = ['pink', 'lightblue', 'lightgreen']
for patch, color in zip(bplot['boxes'], colors):
    patch.set_facecolor(color)
# Make the outline of the third box dash-dotted
bplot['boxes'][2].set_linestyle("-.")
plt.show()
Searching easily reveals how to plot multiple charts on one figure, whether using the same plotting axes, a second y axis or subplots. Much harder to uncover is how to overlay one figure onto another, something like this:
That image was prepared using a bitmap editor to overlay the images. I have no difficulty creating the individual plots, but cannot figure out how to combine them. I expect a single line of code will suffice, but what is it? Here is how I imagine it:
bigFig = plt.figure(1, figsize=[5,25])
...
ltlFig = plt.figure(2)
...
bigFig.overlay(ltlFig, pos=[x,y], size=[1,1])
I've established that I can use figure.add_axes, but it is quite challenging getting the position of the overlaid plot correct, since the parameters are fractions, not x,y values from the first plot. It also [it seems to me - am I wrong?] places constraints on the order in which the charts are plotted, since the main plot must be completed before the other plots are added in turn.
What is the pyplot method that achieves this?
To create an inset axes you may use mpl_toolkits.axes_grid1.inset_locator.inset_axes.
Position of inset axes in axes coordinates
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1.inset_locator import inset_axes
fig, ax= plt.subplots()
inset_axes = inset_axes(ax,
width=1, # inch
height=1, # inch
bbox_transform=ax.transAxes, # relative axes coordinates
bbox_to_anchor=(0.5,0.5), # relative axes coordinates
loc=3) # loc=lower left corner
ax.axis([0,500,-.1,.1])
plt.show()
Position of inset axes in data coordinates
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1.inset_locator import inset_axes
fig, ax= plt.subplots()
inset_axes = inset_axes(ax,
width=1, # inch
height=1, # inch
bbox_transform=ax.transData, # data coordinates
bbox_to_anchor=(250,0.0), # data coordinates
loc=3) # loc=lower left corner
ax.axis([0,500,-.1,.1])
plt.show()
Both of the above produce the same plot
(For a possible drawback of this solution see specific location for inset axes)
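As an alternative sketch, assuming Matplotlib 3.0 or later, the Axes.inset_axes method also accepts a transform, so the inset can be placed directly in the parent's data coordinates. Note that here the width and height are given in data units too, not in inches as above; the bounds and the plotted line are made up for illustration.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.axis([0, 500, -0.1, 0.1])

# Bounds are [x0, y0, width, height]; with transform=ax.transData they are
# read in the parent's data coordinates instead of axes fractions
axins = ax.inset_axes([250, 0.0, 100, 0.04], transform=ax.transData)
axins.plot([0, 1, 2], [0, 1, 0])
```

Because the inset is anchored in data coordinates, it moves and scales with the parent axes limits.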
I've drawn a plot that looks something like the following:
It was created using the following code:
import numpy as np
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt
# 1. Plot a figure consisting of 3 separate axes
# ==============================================
plotNames = ['Plot1','Plot2','Plot3']
figure, axisList = plt.subplots(len(plotNames), sharex=True, sharey=True)
tempDF = pd.DataFrame()
tempDF['date'] = pd.date_range('2015-01-01','2015-12-31',freq='D')
tempDF['value'] = np.random.randn(tempDF['date'].size)
tempDF['value2'] = np.random.randn(tempDF['date'].size)
for i in range(len(plotNames)):
    axisList[i].plot_date(tempDF['date'],tempDF['value'],'b-',xdate=True)
# 2. Create a new single axis in the figure. This new axis sits over
# the top of the axes drawn previously. Make all the components of
# the new single axis invisible except for the x and y labels.
big_ax = figure.add_subplot(111)
big_ax.set_facecolor('none')
big_ax.set_xlabel('Date',fontweight='bold')
big_ax.set_ylabel('Random normal',fontweight='bold')
big_ax.tick_params(labelcolor='none', top=False, bottom=False, left=False, right=False)
big_ax.spines['right'].set_visible(False)
big_ax.spines['top'].set_visible(False)
big_ax.spines['left'].set_visible(False)
big_ax.spines['bottom'].set_visible(False)
# 3. Plot a separate figure
# =========================
figure2,ax2 = plt.subplots()
ax2.plot_date(tempDF['date'],tempDF['value2'],'-',xdate=True,color='green')
ax2.set_xlabel('Date',fontweight='bold')
ax2.set_ylabel('Random normal',fontweight='bold')
# Save plot
# =========
plt.savefig('tempPlot.png',dpi=300)
Basically, the rationale for plotting the whole picture is as follows:
1. Create the first figure and plot 3 separate axes using a loop.
2. Plot a single axis in the same figure to sit on top of the graphs drawn previously. Label the x and y axes. Make all other aspects of this axis invisible.
3. Create a second figure and plot data on a single axis.
The plot displays just as I want when using jupyter-notebook but when the plot is saved, the file contains only the second figure.
I was under the impression that plots could have multiple figures and that figures could have multiple axes. However, I suspect I have a fundamental misunderstanding of the differences between plots, subplots, figures and axes. Can someone please explain what I'm doing wrong and how to get the whole image to save to a single file?
Matplotlib does not have "plots". In that sense,
plots are figures
subplots are axes
During the runtime of a script you can have as many figures as you wish. Calling plt.savefig() will save the currently active figure, i.e. the figure you would get by calling plt.gcf().
You can save any other figure either by providing a figure number num:
plt.figure(num)
plt.savefig("output.png")
or by having a reference to the figure object fig1:
fig1.savefig("output.png")
In order to save several figures into one file, one could go the way detailed here: Python saving multiple figures into one PDF file.
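One way to do that is Matplotlib's PdfPages class, which writes each figure as its own page of a single PDF. A minimal sketch follows; the two figures and the output path in a temporary directory are made up for illustration.

```python
import os
import tempfile

import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

fig1, ax1 = plt.subplots()
ax1.plot([0, 1, 2], [0, 1, 4])

fig2, ax2 = plt.subplots()
ax2.plot([0, 1, 2], [4, 1, 0])

# Hypothetical output path in a temporary directory
pdf_path = os.path.join(tempfile.mkdtemp(), "figures.pdf")

# Each savefig() call appends one page to the same PDF file
with PdfPages(pdf_path) as pdf:
    pdf.savefig(fig1)
    pdf.savefig(fig2)
```

Passing the figure object explicitly avoids depending on which figure is currently active.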
Another option would be not to create several figures, but a single one, using subplots,
fig = plt.figure()
ax = fig.add_subplot(611)
ax2 = fig.add_subplot(612)
ax3 = fig.add_subplot(613)
ax4 = fig.add_subplot(212)
and then plot the respective graphs to those axes using
ax.plot(x,y)
or in the case of a pandas dataframe df
df.plot(x="column1", y="column2", ax=ax)
This second option can of course be generalized to arbitrary axes positions using subplots on grids. This is detailed in the matplotlib user's guide Customizing Location of Subplot Using GridSpec
Furthermore, it is possible to position an axes (a subplot so to speak) at any position in the figure using fig.add_axes([left, bottom, width, height]) (where left, bottom, width, height are in figure coordinates, ranging from 0 to 1).
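A short sketch of fig.add_axes with two axes in one figure. The coordinates and data are arbitrary example values.

```python
import matplotlib.pyplot as plt

fig = plt.figure()

# [left, bottom, width, height] as fractions of the figure size
main_ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])
main_ax.plot([0, 1, 2, 3], [0, 1, 4, 9])

# A smaller axes placed in the upper-left region of the same figure
small_ax = fig.add_axes([0.2, 0.55, 0.3, 0.3])
small_ax.plot([0, 1, 2, 3], [9, 4, 1, 0], color="green")
```

Since both axes belong to the same figure, a single fig.savefig("output.png") call writes them to one file.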