Customizing subplots in matplotlib - matplotlib

I want to place 3 plots using subplots. Two plots on the top row and one plot that will occupy the entire second row.
My code creates a gap between the top two plots and the lower plot. How can I correct that?
df_CI
Country China India
1980 5123 8880
1981 6682 8670
1982 3308 8147
1983 1863 7338
1984 1527 5704
fig = plt.figure() # create figure
ax0 = fig.add_subplot(221) # add subplot 1 (2 row, 2 columns, first plot)
ax1 = fig.add_subplot(222) # add subplot 2 (2 row, 2 columns, second plot).
ax2 = fig.add_subplot(313) # a 3 digit number where the hundreds represent nrows, the tens represent ncols
# and the units represent plot_number.
# Subplot 1: Box plot
df_CI.plot(kind='box', color='blue', vert=False, figsize=(20, 20), ax=ax0) # add to subplot 1
ax0.set_title('Box Plots of Immigrants from China and India (1980 - 2013)')
ax0.set_xlabel('Number of Immigrants')
ax0.set_ylabel('Countries')
# Subplot 2: Line plot
df_CI.plot(kind='line', figsize=(20, 20), ax=ax1) # add to subplot 2
ax1.set_title ('Line Plots of Immigrants from China and India (1980 - 2013)')
ax1.set_ylabel('Number of Immigrants')
ax1.set_xlabel('Years')
# Subplot 3: Bar plot
df_CI.plot(kind='bar', figsize=(20, 20), ax=ax2) # add to subplot 3
ax2.set_title('Bar Plots of Immigrants from China and India (1980 - 2013)')
ax2.set_xlabel('Years')
ax2.set_ylabel('Number of Immigrants')
plt.show()

I've always found subplots syntax a little difficult.
With these calls
ax0 = fig.add_subplot(221)
ax1 = fig.add_subplot(222)
you're dividing your figure into a 2x2 grid and filling the first row.
ax2 = fig.add_subplot(313)
Now you're dividing it into three rows and filling the last one.
You're basically creating two independent subplot grids, and there is no easy way to control how the subplots of one grid are spaced relative to the other.
A much easier and more Pythonic way is to use gridspec to create a single, finer grid and address it with Python slicing.
import matplotlib as mpl
import matplotlib.pyplot as plt
fig = plt.figure()
gs = mpl.gridspec.GridSpec(2, 2, wspace=0.25, hspace=0.25) # 2x2 grid
ax0 = fig.add_subplot(gs[0, 0]) # first row, first col
ax1 = fig.add_subplot(gs[0, 1]) # first row, second col
ax2 = fig.add_subplot(gs[1, :]) # full second row
And now you can also easily tune spacing with wspace and hspace.
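Putting the pieces together, a minimal sketch of the question's figure on a single GridSpec might look like the following. It assumes `df_CI` is a pandas DataFrame indexed by year and holding only the five sample rows shown in the question. The `Agg` backend line is an assumption so the script runs without a display, and the styling and labels from the question are left out for brevity.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove for interactive use
import matplotlib.pyplot as plt
from matplotlib.gridspec import GridSpec
import pandas as pd

# Sample rows copied from the question (the full data set is not shown there).
df_CI = pd.DataFrame(
    {"China": [5123, 6682, 3308, 1863, 1527],
     "India": [8880, 8670, 8147, 7338, 5704]},
    index=pd.Index([1980, 1981, 1982, 1983, 1984], name="Year"),
)

fig = plt.figure(figsize=(20, 20))
gs = GridSpec(2, 2, figure=fig, wspace=0.25, hspace=0.25)  # one shared 2x2 grid
ax0 = fig.add_subplot(gs[0, 0])  # first row, first column
ax1 = fig.add_subplot(gs[0, 1])  # first row, second column
ax2 = fig.add_subplot(gs[1, :])  # whole second row

df_CI.plot(kind="box", vert=False, ax=ax0)
df_CI.plot(kind="line", ax=ax1)
df_CI.plot(kind="bar", ax=ax2)

fig.canvas.draw()  # render once; use plt.show() or fig.savefig(...) as needed
```

Because all three axes come from the same grid, the bottom plot lines up with the outer edges of the top two, and `hspace` alone controls the vertical gap.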
More complex layouts are also a lot easier; it's just the familiar slicing syntax.
fig = plt.figure()
gs = mpl.gridspec.GridSpec(10, 10, wspace=0.25, hspace=0.25)
fig.add_subplot(gs[2:8, 2:8])
fig.add_subplot(gs[0, :])
for i in range(5):
    fig.add_subplot(gs[1, (i*2):(i*2+2)])
fig.add_subplot(gs[2:, :2])
fig.add_subplot(gs[8:, 2:4])
fig.add_subplot(gs[8:, 4:9])
fig.add_subplot(gs[2:8, 8])
fig.add_subplot(gs[2:, 9])
fig.add_subplot(gs[3:6, 3:6])
# fancy colors
cmap = plt.get_cmap("viridis")  # mpl.cm.get_cmap is no longer available in recent matplotlib releases
naxes = len(fig.axes)
for i, ax in enumerate(fig.axes):
    ax.set_xticks([])
    ax.set_yticks([])
    ax.set_facecolor(cmap(float(i)/(naxes-1)))

Related

Cumulative histogram plot from dataframe

The goal is to create a plot like this
Dummy df:
columns = ['number_of_words', 'occurrences']
data = [[1, 2312252],
[2,1000000],
[3,800000],
[4, 400000],
[5, 100000],
[6, 70000],
[7, 40000],
[8, 10000],
[9, 4000],
[10, 50]]
dummy_df = pd.DataFrame(columns=columns, data=data)
The y axis represents the occurrences and the x axis the number of words column from the dummy_df.
The plot should be cumulative along the x axis, so that each bar stacks its value on top of the previous ones.
Example: With number_of_words = 1 we have around 2.3m occurrences. With number_of_words = 2 we have around 1m occurrences, thus it should plot 2.3m + 1m at number_of_words = 2.
At the final entry of number_of_words the histogram should reach sum(occurrences).
I do NOT want to normalize it.
Since you already have the frequencies worked out, just add them up cumulatively:
dummy_df['acc'] = dummy_df.occurrences.cumsum()
ax = dummy_df['acc'].plot(kind='bar', width=1, color='b')
dummy_df['acc'].shift().plot(kind='bar', alpha=0.7, width=1, color='r', ax=ax)
To split each bar into parts, plot the data twice. The first plot is the normal cumsum; the second is just the individual values, with the shifted cumsum as the bottom, so it is drawn over the top part of each cumsum bar.
Slicing the Series with .iloc[1:] just before plotting removes the first bar, which you want to exclude.
fig, ax = plt.subplots()
dummy_df['occurrences'].cumsum().iloc[1:].plot(kind='bar', width=1, ec='k', ax=ax)
dummy_df['occurrences'].iloc[1:].plot(kind='bar', width=1, ec='k',
    bottom=dummy_df['occurrences'].cumsum().shift().fillna(0).iloc[1:], ax=ax, color='red')
plt.show()
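The same stacking can also be written with matplotlib's `ax.bar` directly, which keeps `number_of_words` on the x axis instead of the row index. This is only a sketch using the dummy data from the question; the `Agg` backend line, the axis labels and the legend labels are assumptions added for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove for interactive use
import matplotlib.pyplot as plt
import pandas as pd

dummy_df = pd.DataFrame(
    columns=["number_of_words", "occurrences"],
    data=[[1, 2312252], [2, 1000000], [3, 800000], [4, 400000], [5, 100000],
          [6, 70000], [7, 40000], [8, 10000], [9, 4000], [10, 50]],
)

cumulative = dummy_df["occurrences"].cumsum()   # running total, not normalized
previous = cumulative.shift(fill_value=0)       # total accumulated before each row

fig, ax = plt.subplots()
# Lower segment: what had already accumulated before this number_of_words.
ax.bar(dummy_df["number_of_words"], previous, width=1, edgecolor="k",
       label="previous total")
# Upper segment: this row's own occurrences, stacked on the previous total.
ax.bar(dummy_df["number_of_words"], dummy_df["occurrences"], bottom=previous,
       width=1, edgecolor="k", color="red", label="added at this step")
ax.set_xlabel("number_of_words")
ax.set_ylabel("cumulative occurrences")
ax.legend()

fig.canvas.draw()  # render once; use plt.show() or fig.savefig(...) as needed
```

The top of the last bar equals `dummy_df["occurrences"].sum()`, which is the non-normalized total the question asks for.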

How to determine the matplotlib legend?

I have 3 lists to plot as curves. But every time I run the same plt lines, even with ax.legend(loc='lower right', handles=[line1, line2, line3]), the 3 curves appear in a random order in the legend, as shown below. Is it possible to fix their order and colors, both in the legend and for the curves in the plot?
EDIT:
My code is as below:
def plot_with_fixed_list(n, **kwargs):
    np.random.seed(0)
    fig, ax1 = plt.subplots()
    my_handles = []
    for key, values in kwargs.items():
        value_name = key
        temp, = ax1.plot(np.arange(1, n+ 1, 1).tolist(), values, label=value_name)
        my_handles.append(temp)
    ax1.legend(loc='lower right', handles=my_handles)
    ax1.grid(True, which='both')
    plt.show()
plot_with_fixed_list(300, FA_Hybrid=fa, BP=bp, Ssym_Hybrid=ssym)
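This nondeterminism occurred with python==3.5, matplotlib==3.0.0. After I updated to python==3.6, matplotlib==3.3.2, the problem was solved. A likely explanation is that Python only preserves the order of **kwargs from version 3.6 on (PEP 468), so under 3.5 kwargs.items() could yield the curves in a different order on each run.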
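If the order and colors need to be fixed regardless of the Python version, one option is to stop relying on `**kwargs` and pass an explicit sequence instead. The sketch below is an assumption-laden illustration: `plot_in_fixed_order` is a hypothetical helper, the color names are arbitrary choices, and the random data stands in for the `fa`, `bp` and `ssym` lists that the question does not show.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np


def plot_in_fixed_order(n, series):
    """Plot (label, values, color) triples in exactly the order given."""
    x = np.arange(1, n + 1)
    fig, ax = plt.subplots()
    handles = []
    for label, values, color in series:
        line, = ax.plot(x, values, label=label, color=color)
        handles.append(line)
    # The legend follows the order of the handles list.
    ax.legend(loc="lower right", handles=handles)
    ax.grid(True, which="both")
    return fig, ax


# Placeholder data: the question's fa, bp and ssym lists are not shown.
rng = np.random.default_rng(0)
n = 300
fa, bp, ssym = (np.sort(rng.random(n)) for _ in range(3))

fig, ax = plot_in_fixed_order(n, [
    ("FA_Hybrid", fa, "tab:blue"),
    ("BP", bp, "tab:orange"),
    ("Ssym_Hybrid", ssym, "tab:green"),
])
```

A list of tuples has a defined order on every Python version, and naming the color per curve keeps each curve's color independent of the order in which it happens to be plotted.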

how to put 3 seaborn scatter plots under one another?

I want to combine all 3 seaborn scatter plots under one "frame".
plt.figure(figsize=(7,15))
plt.subplots(3,1)
sns.scatterplot(x=train['Garage Area'], y=train['SalePrice'])
plt.show()
sns.scatterplot(x=train['Gr Liv Area'], y=train['SalePrice'])
plt.show()
sns.scatterplot(x=train['Overall Cond'], y=train['SalePrice'])
plt.show()
But it creates 5, the first 3 are small according to (7,15) size but the last 2 are different.
I suspect it should be
plt.figure(figsize=(7,15))
fig,ax = plt.subplots(3,1)
ax[0] = fig.add_subplot(sns.scatterplot(x=train['Garage Area'], y=train['SalePrice']))
#plt.show()
ax[1] = fig.add_subplot(sns.scatterplot(x=train['Gr Liv Area'], y=train['SalePrice']))
#plt.show()
ax[2] =fig.add_subplot(sns.scatterplot(x=train['Overall Cond'], y=train['SalePrice']))
plt.show()
but all 3 plots are stuck in the last 3rd chart!
The following is one way to do it:
1. Create a figure with 3 subplots (3 rows, 1 column).
2. Pass the respective subplot, ax[0], ax[1] or ax[2], to each of the three sns.scatterplot calls using the keyword ax.
fig, ax = plt.subplots(3, 1, figsize=(7,15))
sns.scatterplot(x=train['Garage Area'], y=train['SalePrice'], ax=ax[0])
sns.scatterplot(x=train['Gr Liv Area'], y=train['SalePrice'], ax=ax[1])
sns.scatterplot(x=train['Overall Cond'], y=train['SalePrice'], ax=ax[2])
plt.show()

How to get rid of plots under mainplot in Seaborn?

I am trying to plot linear regression plots with Seaborn and I end up with this:
and under it these empty plots:
I don't need the last 3 small subplots. How can I get rid of them, or at least get the regressions plotted correctly in the 3 main subplots above?
Here is the code I used:
fig, axes = plt.subplots(3, 1, figsize=(12, 15))
for col, ax in zip(['gross_sqft_thousands','land_sqft_thousands','total_units'], axes.flatten()):
    ax.tick_params(axis='x', rotation=85)
    ax.set_ylabel(col, fontsize=15)
    sns.jointplot(x="sale_price_millions", y=col, data=clean_df, kind='reg', joint_kws={'line_kws':{'color':'cyan'}}, ax=ax)
fig.suptitle('Sale Price vs Continuous Variables', position=(.5,1.02), fontsize=20)
fig.tight_layout()
fig.show()
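One possible explanation for the extra plots is that sns.jointplot is a figure-level function: it builds its own figure with marginal axes and does not draw into an axes passed via ax. If only the regression panels are wanted, the axes-level sns.regplot might be used instead, as in the sketch below. This swaps jointplot for regplot, so the marginal histograms are dropped. The random `clean_df` is a placeholder, since the question's data is not shown.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Placeholder data: clean_df is not shown in the question.
rng = np.random.default_rng(0)
price = rng.uniform(0.5, 5.0, size=50)
clean_df = pd.DataFrame({
    "sale_price_millions": price,
    "gross_sqft_thousands": 2.0 * price + rng.normal(0, 0.5, 50),
    "land_sqft_thousands": 1.5 * price + rng.normal(0, 0.5, 50),
    "total_units": 3.0 * price + rng.normal(0, 1.0, 50),
})

cols = ["gross_sqft_thousands", "land_sqft_thousands", "total_units"]
fig, axes = plt.subplots(3, 1, figsize=(12, 15))
for col, ax in zip(cols, axes.flatten()):
    # regplot is axes-level, so it draws into the subplot it is given.
    sns.regplot(x="sale_price_millions", y=col, data=clean_df,
                line_kws={"color": "cyan"}, ax=ax)
    ax.tick_params(axis="x", rotation=85)
    ax.set_ylabel(col, fontsize=15)
fig.suptitle("Sale Price vs Continuous Variables", fontsize=20)
fig.tight_layout()
```

With this layout no additional figures are created, so only the three requested panels appear.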

plot ordering/layering julia pyplot

I have a subplot that plots a line (x,y) and a particular point (xx,yy). I want to highlight (xx,yy), so I've plotted it with scatter. However, even if I plot it after the original line, the new point still shows up behind the line. How can I fix this? MWE below.
x = 1:10
y = 1:10
xx = 5
yy = 5
fig, ax = subplots()
ax[:plot](x,y)
ax[:scatter](xx,yy, color="red", label="h_star", s=100)
legend()
xlabel("x")
ylabel("y")
title("test")
grid("on")
You can change which plots are displayed on top of each other with the argument zorder. The matplotlib example shown here gives a brief explanation:
The default drawing order for axes is patches, lines, text. This
order is determined by the zorder attribute. The following defaults
are set
Artist                      Z-order
Patch / PatchCollection     1
Line2D / LineCollection     2
Text                        3
You can change the order for individual artists by setting the zorder.
Any individual plot() call can set a value for the zorder of that
particular item.
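The defaults quoted above can be checked on the artists themselves. The following is a small sketch, not part of the original answer; it relies on the fact that scatter() returns a collection and plot() returns Line2D objects, and the `Agg` backend line is an assumption so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove for interactive use
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
line, = ax.plot(range(1, 11), range(1, 11))
points = ax.scatter(5, 5, color="red", s=100)

# By default the scatter collection sits below the line, whatever the call order.
default_scatter_zorder = points.get_zorder()
print("line zorder:", line.get_zorder())
print("scatter zorder (default):", default_scatter_zorder)

# Raise the marker above the line after the fact.
points.set_zorder(line.get_zorder() + 1)
print("scatter zorder (raised):", points.get_zorder())
```

Setting zorder through set_zorder() has the same effect as passing zorder=... in the scatter() call.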
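A full example based on the code in the question, using Python, is shown below. The scatter marker is a collection (default zorder 1) and the line is a Line2D (default zorder 2), so the marker needs a zorder above 2 to appear on top: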
import matplotlib.pyplot as plt
x = range(1, 11)  # 1..10, matching Julia's inclusive 1:10
y = range(1, 11)
xx = 5
yy = 5
fig, ax = plt.subplots()
ax.plot(x,y)
# could set zorder very high, say 10, to "make sure" it will be on the top
ax.scatter(xx,yy, color="red", label="h_star", s=100, zorder=3)
plt.legend()
plt.xlabel("x")
plt.ylabel("y")
plt.title("test")
plt.grid(True)
plt.show()
This draws the red marker on top of the line.