How do I display only every nth axis label - matplotlib

I am using matplotlib to make a scatter plot, and the x-axis labels are running together to the point they are illegible. Here's all the relevant code:
plt.xticks(rotation=30)
plt.scatter(x,y)
plt.show()
x and y are lists of x-axis values and y-axis values, respectively.
This SO post (matplotlib: how to prevent x-axis labels from overlapping each other) asks the same question, but if there's an answer anywhere in there, I can't tease it out.
This SO post (Cleanest way to hide every nth tick label in matplotlib colorbar?) asks a similar question in the context of colorbars. All of the responses that seem to work for people are of the form
for label in cbar.ax.xaxis.get_ticklabels()[::2]:
    label.set_visible(False)
or
plt.setp(cbar.ax.get_xticklabels()[::2], visible=False)
where cbar is the asker's colorbar object. Every time I try to adapt these solutions to my case, for example
plt.xticks(rotation=30)
plot = plt.scatter(x,y)
plt.setp(plot.get_xticklabels()[::2], visible=False)
plt.show()
I get errors like
AttributeError: 'PathCollection' object has no attribute 'get_xticklabels'.
Similar to the above, if I try plot.ax.get_xticklabels() I get AttributeError: 'PathCollection' object has no attribute 'ax', etc.
How do I show only every nth axis label?
Edit: This worked. First set all labels to not visible, then make every Nth label visible:
plot = plt.scatter(x,y)
plt.setp(plot.axes.get_xticklabels(), visible=False)
plt.setp(plot.axes.get_xticklabels()[::5], visible=True)
The above does every 5th label; change it to whatever you need.
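The same idea can be sketched against the Axes object directly, since plt.scatter returns a PathCollection while the tick labels belong to the axes. This is a minimal sketch, not a definitive recipe: the helper name show_every_nth_xlabel and the sample data are made up for illustration. It fetches the label list once, so it does not depend on what a second get_xticklabels() call returns after labels have been hidden.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt


def show_every_nth_xlabel(ax, n):
    """Keep every n-th x tick label on `ax` visible and hide the rest.

    Hypothetical helper, not part of matplotlib.
    """
    labels = ax.get_xticklabels()  # fetch the label objects once
    for i, label in enumerate(labels):
        label.set_visible(i % n == 0)
    return labels


# Made-up data: 20 string categories, which give one tick per category.
x = [f"item {i}" for i in range(20)]
y = list(range(20))

fig, ax = plt.subplots()
ax.scatter(x, y)
ax.tick_params(axis="x", rotation=30)
labels = show_every_nth_xlabel(ax, 5)  # labels 0, 5, 10, 15 stay visible
fig.canvas.draw()
```

Change the second argument to keep a different fraction of the labels.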

Related

What does ax=ax do while creating a plot in matplotlib?

I have a DataFrame of heart disease patients with over 300 rows. I first filtered it down to the patients aged over 50. I am now trying to plot that DataFrame, and while searching on Google I found this piece of code that helped me plot it.
But I am not able to understand the concept of ax = ax here:
fig, ax = plt.subplots()
over_50.plot(x="age",
             y="chol",
             c="target",
             kind="scatter",
             ax=ax);  # <--------- the argument in question
I want to learn the concept behind this little piece of code here. What is it doing at its core?
In this case (a single axes plot) you can do without this parameter.
But there are more complex cases, when you create subplots with
a number of axes objects (a grid).
In this case ax (the second result from plt.subplots()) is an array
of axes objects.
Then, creating each plot, you should specify in which axes this plot
is to be created.
See e.g. https://matplotlib.org/3.1.0/gallery/subplots_axes_and_figures/subplots_demo.html
and find the title "Stacking subplots in one direction".
It contains this example:
fig, axs = plt.subplots(2)
fig.suptitle('Vertically stacked subplots')
axs[0].plot(x, y)
axs[1].plot(x, -y)
Here:
a figure is created with 2 rows (two vertically stacked axes),
one line plot is drawn in the first axes and another plot in the second.
The alternative way to specify the axes object in which a particular plot
is to be created is the ax parameter, as in your code,
where you pass one of the axes objects from the current figure.
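A minimal sketch of the ax parameter with more than one axes follows. The DataFrame contents and the max_hr column are made up for illustration; only the age and chol column names come from the question.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in for the over_50 DataFrame; "max_hr" is a hypothetical column.
over_50 = pd.DataFrame({"age": [51, 58, 63, 67, 70],
                        "chol": [230, 250, 199, 286, 210],
                        "max_hr": [150, 140, 160, 108, 125]})

# One figure with two axes side by side; axs is an array of Axes objects.
fig, axs = plt.subplots(1, 2, figsize=(10, 4))

# ax= tells pandas which existing axes to draw into instead of creating a new one.
left = over_50.plot(x="age", y="chol", kind="scatter", ax=axs[0])
right = over_50.plot(x="age", y="max_hr", kind="scatter", ax=axs[1])
```

Each call returns the axes it drew into, so left and right are the same objects as axs[0] and axs[1].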

Matplotlib/Seaborn: Boxplot collapses on x axis

I am creating a series of boxplots in order to compare different cancer types with each other (based on 5 categories). For plotting I use seaborn/matplotlib. It works fine for most of the cancer types (see image right); however, in some the x axis collapses slightly (see image left) or strongly (see image middle).
https://i.imgur.com/dxLR4B4.png
Looking into the code how seaborn plots a box/violin plot https://github.com/mwaskom/seaborn/blob/36964d7ffba3683de2117d25f224f8ebef015298/seaborn/categorical.py (line 961)
violin_data = remove_na(group_data[hue_mask])
I realized that this happens when there are too many NaNs.
Is there any way to prevent this collapsing by code only?
I do not want to modify my dataframe (replace the NaNs with zero).
Below you find my code:
boxp_df=pd.read_csv(pf_in,sep="\t",skip_blank_lines=False)
fig, ax = plt.subplots(figsize=(10, 10))
sns.violinplot(data=boxp_df, ax=ax)
plt.xticks(rotation=-45)
plt.ylabel("label")
plt.tight_layout()
plt.savefig(pf_out)
The output is a differently sized plot per cancer type
(depending on whether any category is completely NaN).
I am expecting each plot to have the same width.
Update
Trying to use the order parameter as suggested leads to the following output:
https://i.imgur.com/uSm13Qw.png
Maybe this toy example helps?
| Cat1 | Cat2 | Cat3 | Cat4 | Cat5  |
| 3.93 |      | 0.52 |      | 6.01  |
| 3.34 |      | 0.89 |      | 2.89  |
| 3.39 |      | 1.96 |      | 4.63  |
| 1.59 |      | 3.66 |      | 3.75  |
| 2.73 |      | 0.39 |      | 2.87  |
| 0.08 |      | 1.25 |      | -0.27 |
Update
Apparently, the problem is not the data but the length of the title
https://github.com/matplotlib/matplotlib/issues/4413
Therefore I would close the question.
@Diziet, should I delete it, or might my issue help others?
Sorry for not including the line below in the code example:
ax.set_title("VERY LONG TITLE", fontsize=20)
It's hard to be sure without data to test it with, but I think you can pass the names of your categories/cancers to the order= parameter. This forces seaborn to use/display those, even if they are empty.
For instance:
tips = sns.load_dataset("tips")
ax = sns.violinplot(x="day", y="total_bill", data=tips, order=['Thur','Fri','Sat','Freedom Day','Sun','Durin\'s Day'])

How to get legend next to plot in Seaborn?

I am plotting a relplot with Seaborn, but getting the legend (and an empty axis plot) printed under the main plot.
Here is what it looks like (in 2 photos, as my screen isn't that big):
Here is the code I used:
fig, axes = plt.subplots(1, 1, figsize=(12, 5))
clean_df['tax_class_at_sale'] = clean_df['tax_class_at_sale'].apply(str)
sns.relplot(x="sale_price_millions", y='gross_sqft_thousands', hue="neighborhood", data=clean_df, ax=axes)
fig.suptitle('Sale Price by Neighborhood', position=(.5,1.05), fontsize=20)
fig.tight_layout()
fig.show()
Does anyone have an idea how to fix that, so that the legend (maybe much smaller, but that's not a problem) is printed next to the plot and the empty axis disappears?
Here is the form of my dataset (in 2 screenshots, to capture all columns; "sale_price_millions" is the target column):
Since you failed to provide a Minimal, Complete, and Verifiable example, no one can give you a final working answer, because we can't reproduce your figure. Nevertheless, you can try specifying the location of the legend as follows and see if it works as you want:
sns.relplot(x="sale_price_millions", y='gross_sqft_thousands', hue="neighborhood", data=clean_df, ax=axes)
plt.legend(loc=(1.05, 0.5))
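One possible cause of the empty axes is that relplot is a figure-level function: it builds its own figure rather than drawing into an axes passed as ax, so the axes from plt.subplots stays blank. A hedged alternative is the axes-level scatterplot, sketched below with made-up data that only reuses the column names from the question. It assumes a seaborn version that provides sns.move_legend (0.11.2 or later).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up stand-in for clean_df; only the column names come from the question.
clean_df = pd.DataFrame({
    "sale_price_millions": [0.5, 1.2, 0.8, 2.5, 1.9, 0.7],
    "gross_sqft_thousands": [1.0, 2.1, 1.4, 3.8, 2.9, 1.1],
    "neighborhood": ["A", "A", "B", "B", "C", "C"],
})

fig, ax = plt.subplots(figsize=(12, 5))

# scatterplot is axes-level, so it draws into the axes it is given.
sns.scatterplot(x="sale_price_millions", y="gross_sqft_thousands",
                hue="neighborhood", data=clean_df, ax=ax)

# Place the legend just outside the right edge of the axes, vertically centered.
sns.move_legend(ax, "center left", bbox_to_anchor=(1.02, 0.5))

fig.suptitle("Sale Price by Neighborhood", fontsize=20)
fig.tight_layout()
```

When saving, passing bbox_inches="tight" to fig.savefig helps keep a legend placed outside the axes inside the image.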

colorbars for grid of line (not contour) plots in matplotlib

I'm having trouble giving colorbars to a grid of line plots in Matplotlib.
I have a grid of plots, which each shows 64 lines. The lines depict the penalty value vs time when optimizing the same system under 64 different values of a certain hyperparameter h.
Since there are so many lines, instead of using a standard legend, I'd like to use a colorbar, and color the lines by the value of h. In other words, I'd like something that looks like this:
The above was done by adding a new axis to hold the colorbar, by calling figure.add_axes([0.95, 0.2, 0.02, 0.6]), passing in the axis position explicitly as parameters to that method. The colorbar was then created as in the example code here, by instantiating a ColorbarBase(). That's fine for single plots, but I'd like to make a grid of plots like the one above.
To do this, I tried doubling the number of subplots, and using every other subplot axis for the colorbar. Unfortunately, this led to the colorbars having the same size/shape as the plots:
Is there a way to shrink just the colorbar subplots in a grid of subplots like the 1x2 grid above?
Ideally, it'd be great if the colorbar just shared the same axis as the line plot it describes. I saw that the colorbar.colorbar() function has an ax parameter:
ax
parent axes object from which space for a new colorbar axes will be stolen.
That sounds great, except that colorbar.colorbar() requires you to pass in an imshow image or a ContourSet, but my plot is neither an image nor a contour plot. Can I achieve the same (axis-sharing) effect using ColorbarBase?
It turns out you can have different-shaped subplots, so long as all the plots in a given row have the same height, and all the plots in a given column have the same width.
You can do this using gridspec.GridSpec, as described in this answer.
So I set the columns with line plots to be 20x wider than the columns with color bars. The code looks like:
import matplotlib as mpl
import matplotlib.pyplot as plt
from matplotlib import cm, colorbar, gridspec

# num_rows, num_columns, x_vec_lists, y_vec_lists and color_hyperparam_vecs
# are assumed to be defined elsewhere.
grid_spec = gridspec.GridSpec(num_rows,
                              num_columns * 2,
                              width_ratios=[20, 1] * num_columns)
colormap_type = cm.cool
for plot_index, (x_vec_list,
                 y_vec_list,
                 color_hyperparam_vec) in enumerate(zip(x_vec_lists,
                                                        y_vec_lists,
                                                        color_hyperparam_vecs)):
    # Even grid cells hold the line plots, odd cells hold their colorbars.
    line_axis = plt.subplot(grid_spec[plot_index * 2])
    colorbar_axis = plt.subplot(grid_spec[plot_index * 2 + 1])
    colormap_normalizer = mpl.colors.Normalize(vmin=color_hyperparam_vec.min(),
                                               vmax=color_hyperparam_vec.max())
    scalar_to_color_map = mpl.cm.ScalarMappable(norm=colormap_normalizer,
                                                cmap=colormap_type)
    colorbar.ColorbarBase(colorbar_axis,
                          cmap=colormap_type,
                          norm=colormap_normalizer)
    # Color each line by the hyperparameter value it was optimized with.
    for line_index, (x_vec, y_vec) in enumerate(zip(x_vec_list, y_vec_list)):
        hyperparam = color_hyperparam_vec[line_index]
        line_color = scalar_to_color_map.to_rgba(hyperparam)
        line_axis.plot(x_vec, y_vec, color=line_color, alpha=0.5)
For num_rows=1 and num_columns=1, this looks like:
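For the axis-sharing effect asked about in the question, a shorter route may be to hand a ScalarMappable to fig.colorbar together with ax=, which takes the space for the colorbar from the given axes. This is a sketch under assumptions: it relies on a matplotlib version where a ScalarMappable without an attached array is accepted (3.1 or later, as far as I know), and the hyperparameter values and curves are made up.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.cm import ScalarMappable
from matplotlib.colors import Normalize

# Made-up data: 64 values of the hyperparameter h, one decaying curve per value.
h_values = np.linspace(0.1, 1.0, 64)
t = np.linspace(0.0, 5.0, 50)

# The mappable converts an h value into a color and also backs the colorbar.
norm = Normalize(vmin=h_values.min(), vmax=h_values.max())
mappable = ScalarMappable(norm=norm, cmap="cool")

fig, ax = plt.subplots()
for h in h_values:
    ax.plot(t, np.exp(-h * t), color=mappable.to_rgba(h), alpha=0.5)

# ax=ax makes the colorbar take its space from this axes.
cbar = fig.colorbar(mappable, ax=ax)
cbar.set_label("h")
```

In a grid of subplots the same call can be repeated per axes, each with its own norm, without doubling the number of grid columns.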

matplotlib tick labels position relative to axes

I set matplotlib to put ticks outside the plot area, but now they overlap the corresponding labels. The tick_params method does not seem to provide any option to set the position of the corresponding labels.
So I guess I will have to write my own function using the text() method. In the meantime, does anyone have a better suggestion?
To shift the tick labels relative to the ticks use pad. Compare
ax.tick_params(direction='out', pad=5)
plt.draw()
with
ax.tick_params(direction='out', pad=15)
plt.draw()
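The two settings can also be compared in one self-contained figure. The sketch below uses made-up data and a hypothetical helper, first_xlabel_top, to measure where the first x tick label ends up after drawing. The pad value is in points.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Two axes in one row, so both share the same bottom edge.
fig, (ax_near, ax_far) = plt.subplots(1, 2)
for ax, pad in ((ax_near, 5), (ax_far, 15)):
    ax.plot([0, 1, 2], [0, 1, 4])
    ax.tick_params(direction="out", pad=pad)

fig.canvas.draw()  # lay out the tick labels so their positions can be measured


def first_xlabel_top(ax):
    """Top edge, in pixels, of the first x tick label (hypothetical helper)."""
    return ax.get_xticklabels()[0].get_window_extent().y1


top_near = first_xlabel_top(ax_near)
top_far = first_xlabel_top(ax_far)
```

The labels on the pad=15 axes sit lower than those on the pad=5 axes, so top_far is smaller than top_near.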