Secondary y axis limit in pandas plot - pandas

Is there a way to set the limits of the secondary y axis in a pandas df.plot call?
I have the following plotting statement. Is there a simple way to add a ylim for the secondary axis, something like secondary_ylim=(0, 1)?
df[["Date","Col1","Col2"]].plot(x="Date",y=["Col1","Col2"],secondary_y="Col2",ylim=(0,1))

As far as I know, df.plot has no keyword for the limits of the secondary y axis.
You can, however, get the secondary axes from the figure and set its limits there:
import numpy as np
import pandas as pd
df = pd.DataFrame({'Date': pd.date_range('2019-02-01', periods=10), 'Col1': np.random.randint(0, 10, 10), 'Col2': np.random.randint(100, 500, 10)})
ax = df[["Date","Col1","Col2"]].plot(x="Date", y=["Col1","Col2"], secondary_y="Col2", ylim=(0, 5))
ax.set_ylim(0, 5)  # primary (left) y axis
fig = ax.get_figure()
axes = fig.get_axes()  # [primary axes, secondary axes]
axes[1].set_ylim(0, 250)  # secondary (right) y axis
Or, as @Stef points out, you can use the right_ax attribute of the returned axes:
ax = df[["Date","Col1","Col2"]].plot(x="Date",y=["Col1","Col2"],secondary_y="Col2", ylim = ([0,5]))
ax.right_ax.set_ylim(0,250)
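If you would rather not depend on the right_ax attribute that pandas attaches, the same idea can be sketched in plain matplotlib by creating the secondary axis yourself with twinx() and setting the limits of each axis explicitly. The data and the limits below are made up for illustration, and the Agg backend line is only an assumption so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up data with the same column names as in the question.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Date": pd.date_range("2019-02-01", periods=10),
    "Col1": rng.integers(0, 10, 10),
    "Col2": rng.integers(100, 500, 10),
})

fig, ax_left = plt.subplots()
ax_right = ax_left.twinx()  # second y axis that shares the x axis

ax_left.plot(df["Date"], df["Col1"], color="C0", label="Col1")
ax_right.plot(df["Date"], df["Col2"], color="C1", label="Col2 (right)")

# Each axis keeps its own, independent y limits.
ax_left.set_ylim(0, 10)
ax_right.set_ylim(0, 500)

# Merge the lines of both axes into a single legend.
handles = ax_left.get_lines() + ax_right.get_lines()
ax_left.legend(handles, [h.get_label() for h in handles])
fig.autofmt_xdate()
```

Because both axes are ordinary matplotlib Axes objects here, any other per-axis setting (labels, ticks, scale) can be applied to the right axis in the same way.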

y and x axis subplots matplotlib

A fairly basic question about the tick labels on the x and y axes. With this code:
fig, axes = plt.subplots(6,12, figsize=(50, 24), constrained_layout=True, sharex=True , sharey=True)
fig.subplots_adjust(hspace = .5, wspace=.5)
custom_xlim = (-1, 1)
custom_ylim = (-0.2,0.2)
for i in range(72):
    x_data = ctheta[i]
    y_data = phi[i]
    y_err = err_phi[i]
    ax = fig.add_subplot(6, 12, i+1)
    ax.plot(x_data_new, bspl(x_data_new))
    ax.axis('off')
    ax.errorbar(x_data,y_data, yerr=y_err, fmt="o")
    ax.set_xlim(custom_xlim)
    ax.set_ylim(custom_ylim)
I get the following output:
The plots in the first column still show y tick labels and those in the last row still show x tick labels, although I turned the axes off.
Any ideas why?
As @BigBen wrote in their comment, the issue is that you add axes to your figure twice: once via fig, axes = plt.subplots() and then again inside your loop via fig.add_subplot(). As a result, the first set of axes is still visible even after you apply .axis('off') to the second set.
Instead of the latter, you could change your loop to:
for i in range(6):
    for j in range(12):
        ax = axes[i,j] # these are the axes created via plt.subplots(6,12,...)
        ax.axis('off')
        # … your other code here
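A variant that keeps the single running index from the question is to iterate over axes.flat, which walks the array of Axes row by row. The sketch below uses a smaller 2 x 3 grid and made-up data in place of ctheta, phi and err_phi; those placeholders and the Agg backend line are assumptions for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

n_rows, n_cols = 2, 3  # the question uses a 6 x 12 grid
x = np.linspace(-1, 1, 20)

fig, axes = plt.subplots(n_rows, n_cols, sharex=True, sharey=True,
                         constrained_layout=True)

# axes.flat visits the Axes row by row, so i matches the position that
# fig.add_subplot(n_rows, n_cols, i + 1) would have referred to.
for i, ax in enumerate(axes.flat):
    y = 0.1 * np.sin(3 * x + i)      # placeholder for phi[i]
    y_err = np.full_like(x, 0.02)    # placeholder for err_phi[i]
    ax.errorbar(x, y, yerr=y_err, fmt="o", markersize=3)
    ax.set_xlim(-1, 1)
    ax.set_ylim(-0.2, 0.2)
    ax.axis("off")                   # hides ticks, labels and frame of this Axes
```

Since no second set of axes is created, the figure contains exactly n_rows * n_cols Axes and every one of them is switched off.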

How to plot a spectrum with plotly

I want to plot a spectrum that is given by an array of masses and intensities. For each pair I want to plot a thin line. When I zoom in, the width of the lines should not change. The bar plot does almost what I need.
import plotly.graph_objects as go
fig = go.Figure(data=[go.Bar(
    x=df['mz_array'],
    y=df['intensity'],
    width=1
)])
fig.show()
However, when I zoom in the bars change their widths.
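I modified the DataFrame and then used px.line to plot the spectrum. Each peak gets a second point at zero intensity, and coloring by the original row index turns every peak into its own two-point trace, so each one is drawn as a vertical line whose width does not change on zoom: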
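import pandas as pd
import plotly.express as px

def plot_spectrum(df, annot_threshold=1e4, threshold=0):
    # keep peaks above the threshold; reset_index() adds an 'index' column that identifies each peak
    df = df[df.intensity > threshold].reset_index()
    # df1 holds the peak tops, labelled with their m/z value if intense enough
    df1 = df.copy()
    df1['Text'] = df1.mz_array.astype(str)
    df1.loc[df1.intensity < annot_threshold, 'Text'] = None
    # df2 holds the matching base points at zero intensity
    df2 = df.copy()
    df2['intensity'] = 0
    df3 = pd.concat([df1, df2]).sort_values(['index', 'intensity'])
    # one trace per peak: a vertical line from (mz, 0) to (mz, intensity)
    fig = px.line(df3, x='mz_array', y='intensity', color='index', text='Text')
    fig.update_layout(showlegend=False)
    fig.update_traces(line=dict(width=1, color='grey'))
    fig.update_traces(textposition='top center')
    return fig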
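The same "one vertical segment per peak" idea can also be expressed without one trace per peak, by building a single pair of coordinate lists in which a None entry separates consecutive segments. The helper name stem_coordinates and the peak values below are hypothetical and only meant as a sketch.

```python
def stem_coordinates(mz, intensity):
    """Return x and y lists describing one vertical segment per peak.

    Each peak contributes the points (m, 0) and (m, i), followed by a
    None entry that acts as a break so neighbouring peaks stay unconnected.
    """
    xs, ys = [], []
    for m, i in zip(mz, intensity):
        xs += [m, m, None]
        ys += [0, i, None]
    return xs, ys


# Made-up peaks for illustration.
xs, ys = stem_coordinates([100.5, 200.25], [10.0, 50.0])
print(xs)
print(ys)
```

The two lists can then be passed to a single line trace, for example go.Scatter(x=xs, y=ys, mode="lines", line=dict(width=1)), which keeps the whole spectrum in one trace; the text annotations from the answer above would have to be added separately.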

plot 3d scatter plot from pandas data frame and colour by group

I am doing a PCA, and have 10 components. I would like to plot the first 3 components and colour according to their group type.
from mpl_toolkits.mplot3d import Axes3D
df=pd.DataFrame(np.random.rand(30,20))
grp=round(pd.DataFrame(np.random.rand(30)*10),0)
df['grp']=grp
fig = plt.figure(figsize=(12, 9))
ax = Axes3D(fig)
y = df.iloc[:,1]
x = df.iloc[:,0]
z = df.iloc[:,2]
c = df['grp']
ax.scatter(x,y,z, c=c, cmap='coolwarm')
plt.title('First 3 Principal Components')
ax.set_ylabel('PC2')
ax.set_xlabel('PC1')
ax.set_zlabel('PC3')
plt.legend()
This works, but unfortunately it does not show a legend, nor, I believe, all of the possible groups.
Any suggestions?
Check out pandas groupby: group your data by grp and plot each group individually:
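from mpl_toolkits.mplot3d import Axes3D  # registers the 3d projection on older matplotlib versions
fig = plt.figure(figsize=(12, 9))
ax = fig.add_subplot(projection='3d')  # recent matplotlib no longer adds Axes3D(fig) to the figure automatically
# grp_idx holds index labels; iloc works here because df has the default RangeIndex
for grp_name, grp_idx in df.groupby('grp').groups.items():
    y = df.iloc[grp_idx,1]
    x = df.iloc[grp_idx,0]
    z = df.iloc[grp_idx,2]
    ax.scatter(x,y,z, label=grp_name) # this way you can control color/marker/size of each group freely
    # or, to do everything in one line instead of the four lines above:
    # ax.scatter(*df.iloc[grp_idx, [0, 1, 2]].T.values, label=grp_name)
ax.legend()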
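Iterating over the groupby object directly yields each group's sub-frame, which avoids the positional lookup through iloc. The following is a self-contained sketch with made-up data; the column names PC1 to PC3, the number of groups and the Agg backend line are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up "principal components" and an integer group label per row.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((30, 3)), columns=["PC1", "PC2", "PC3"])
df["grp"] = rng.integers(0, 4, size=30)

fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(projection="3d")

# One scatter call per group: each call gets its own color and legend entry.
for name, group in df.groupby("grp"):
    ax.scatter(group["PC1"].to_numpy(),
               group["PC2"].to_numpy(),
               group["PC3"].to_numpy(),
               label=f"group {name}")

ax.set_xlabel("PC1")
ax.set_ylabel("PC2")
ax.set_zlabel("PC3")
ax.set_title("First 3 Principal Components")
ax.legend(title="grp")

handles, labels = ax.get_legend_handles_labels()
```

The legend ends up with one entry per distinct value of grp, which also makes it easy to check that every group is actually drawn.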
You can try adding a colorbar:
# note the difference
cb = ax.scatter(x,y,z, c=c, cmap='coolwarm')
plt.title('First 3 Principal Components')
ax.set_ylabel('PC2')
ax.set_xlabel('PC1')
ax.set_zlabel('PC3')
# and here we add a colorbar
plt.colorbar(cb)  # the colorbar replaces the legend; plt.legend() would find no labeled artists here

Matplotlib set x tick labels does not swap order

I want to make a line graph in which (Dog,1), (Cat,2), (Bird,3) and so on are plotted and connected by a line. In addition, I would like to control the order of the labels on the x axis. Matplotlib automatically orders them 'Dog', 'Cat', 'Bird'. Despite my attempt to rearrange the order to 'Dog', 'Bird', 'Giraffe', 'Cat', the graph doesn't change (see image). What should I do to arrange the graph accordingly?
x = ['Dog','Cat','Bird','Dog','Cat','Bird','Dog','Cat','Cat','Cat']
y = [1,2,3,4,5,6,7,8,9,10]
x_ticks_labels = ['Dog','Bird','Giraffe','Cat']
fig, ax = plt.subplots(1,1)
ax.plot(x,y)
# Set number of ticks for x-axis
ax.set_xticks(range(len(x_ticks_labels)))
# Set ticks labels for x-axis
ax.set_xticklabels(x_ticks_labels)
Use matplotlib's categorical feature
You may predetermine the order of categories on the axes by first plotting something in the correct order then removing it again.
import numpy as np
import matplotlib.pyplot as plt
x = ['Dog','Cat','Bird','Dog','Cat','Bird','Dog','Cat','Cat','Cat']
y = [1,2,3,4,5,6,7,8,9,10]
x_ticks_labels = ['Dog','Bird','Giraffe','Cat']
fig, ax = plt.subplots(1,1)
sentinel, = ax.plot(x_ticks_labels, np.linspace(min(y), max(y), len(x_ticks_labels)))
sentinel.remove()
ax.plot(x,y, color="C0", marker="o")
plt.show()
Determine indices of values
The other option is to determine the indices that the values from x would take inside x_ticks_labels. There is unfortunately no canonical way to do so; here I take the solution from this answer using np.where. Then one can simply plot the y values against those indices and set the ticks and tick labels accordingly.
import numpy as np
import matplotlib.pyplot as plt
x = ['Dog','Cat','Bird','Dog','Cat','Bird','Dog','Cat','Cat','Cat']
y = [1,2,3,4,5,6,7,8,9,10]
x_ticks_labels = ['Dog','Bird','Giraffe','Cat']
xarr = np.array(x)
ind = np.where(xarr.reshape(xarr.size, 1) == np.array(x_ticks_labels))[1]
fig, ax = plt.subplots(1,1)
ax.plot(ind,y, color="C0", marker="o")
ax.set_xticks(range(len(x_ticks_labels)))
ax.set_xticklabels(x_ticks_labels)
plt.show()
Both approaches give the same result: the x axis is ordered Dog, Bird, Giraffe, Cat.
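The index lookup of the second approach can also be sketched with a plain dictionary instead of np.where, which some readers may find easier to follow. The variable names position and ind are illustrative; note that this version raises a KeyError if x contains a value that is missing from x_ticks_labels.

```python
x = ['Dog', 'Cat', 'Bird', 'Dog', 'Cat', 'Bird', 'Dog', 'Cat', 'Cat', 'Cat']
x_ticks_labels = ['Dog', 'Bird', 'Giraffe', 'Cat']

# Map every label to its desired position on the x axis.
position = {label: i for i, label in enumerate(x_ticks_labels)}

# Translate the categorical x values into those positions.
ind = [position[label] for label in x]
print(ind)
```

The resulting ind list can be used in place of the np.where result, that is, plotted against y and combined with set_xticks and set_xticklabels as in the code above.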

Modifying Axes objects returned by plt.subplots() with a for loop

I'm trying to use a for loop to fill each Axes in a grid of subplots with the following code:
df = sns.load_dataset('iris')
cols = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width']
# plotting
fig, ax = plt.subplots(2,2)
for ax_row in range(2):
    for ax_col in range(2):
        for col in cols:
            sns.distplot(df[col], ax=ax[ax_row][ax_col])
But I got the same plot in all four axes. How should I change it to make it work?
The problem is the for col in cols: loop, which runs through all the columns for each subplot. What you need instead is to plot one column per subplot. One way to do so is to use an index i and update it as you loop through the subplots:
import matplotlib.pyplot as plt
import seaborn as sns
df = sns.load_dataset('iris')
cols = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width']
# plotting
fig, ax = plt.subplots(2,2, figsize=(8, 6))
i = 0  # index of the column to draw in the current subplot
for ax_row in range(2):
    for ax_col in range(2):
        ax_ = sns.distplot(df[cols[i]], ax=ax[ax_row][ax_col])
        i += 1
plt.tight_layout()
EDIT: Using enumerate
fig, ax = plt.subplots(2,2, figsize=(8, 6))
for i, axis in enumerate(ax.flatten()):
    ax_ = sns.distplot(df[cols[i]], ax=axis)
plt.tight_layout()
EDIT 2: Using enumerate on cols
fig, axes = plt.subplots(2,2, figsize=(8, 6))
for i, col in enumerate(cols):
    ax_ = sns.distplot(df[col], ax=axes.flatten()[i])
plt.tight_layout()
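Pairing the Axes with the columns through zip removes the index altogether. The sketch below uses random data with the iris column names and plain matplotlib histograms, so that it runs without downloading the dataset; both substitutions, and the Agg backend line, are assumptions made for illustration. With seaborn, the call inside the loop would be the seaborn plotting function of your choice (distplot has since been deprecated in favor of histplot and displot).

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Random stand-in for the iris measurements (same column names).
rng = np.random.default_rng(0)
cols = ["sepal_length", "sepal_width", "petal_length", "petal_width"]
df = pd.DataFrame(rng.normal(size=(150, 4)), columns=cols)

fig, axes = plt.subplots(2, 2, figsize=(8, 6))

# zip pairs the first Axes with the first column, the second with the second, ...
for ax, col in zip(axes.flat, cols):
    ax.hist(df[col], bins=20)
    ax.set_title(col)

fig.tight_layout()
```

zip stops at the shorter of the two sequences, so a grid with more Axes than columns simply leaves the remaining Axes empty.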