Matplotlib annotate doesn't work on log scale? - matplotlib

I am making log-log plots for different data sets and need to include the best fit line equation. I know where in the plot I should place the equation, but since the data sets have very different values, I'd like to use relative coordinates in the annotation. (Otherwise, the annotation would move for every data set.)
I am aware of the annotate() function of matplotlib, and I know that I can use textcoords='axes fraction' to enable relative coordinates. When I plot my data on the regular scale, it works. But then I change at least one of the scales to log and the annotation disappears. I get no error message.
Here's my code:
plt.clf()
samplevalues = [100,1000,5000,10^4]
ax = plt.subplot(111)
ax.plot(samplevalues,samplevalues,'o',color='black')
ax.annotate('hi',(0.5,0.5), textcoords='axes fraction')
ax.set_xscale('log')
ax.set_yscale('log')
plt.show()
If I comment out ax.set_xcale('log') and ax.set_ycale('log'), the annotation appears right in the middle of the plot (where it should be). Otherwise, it doesn't appear.
Thanks in advance for your help!

It may really be a bug as pointed out by #tcaswell in the comment but a workaround is to use text() in axis coords:
plt.clf()
samplevalues = [100,1000,5000,10^4]
ax = plt.subplot(111)
ax.loglog(samplevalues,samplevalues,'o',color='black')
ax.text(0.5, 0.5,'hi',transform=ax.transAxes)
plt.show()
Another approach is to use figtext() but that is more cumbersome to use if there are already several plots (panels).
By the way, in the code above, I plotted the data using log-log scale directly. That is, instead of:
ax.plot(samplevalues,samplevalues,'o',color='black')
ax.set_xscale('log')
ax.set_yscale('log')
I did:
ax.loglog(samplevalues,samplevalues,'o',color='black')

Related

Turn off x-axis marginal distribution axes on jointplot using seaborn package

There is a similar question here, however I fail to adapt the provided solutions to my case.
I want to have a jointplot with kind=hex while removing the marginal plot of the x-axis as it contains no information. In the linked question the suggestion is to use JointGrid directly, however Seaborn then seems to to be unable to draw the hexbin plot.
joint_kws = dict(gridsize=70)
g = sns.jointplot(data=all_data, x="Minute of Hour", y="Frequency", kind="hex", joint_kws=joint_kws)
plt.ylim([49.9, 50.1])
plt.xlim([0, 60])
g.ax_joint.axvline(x=30,ymin=49, ymax=51)
plt.show()
plt.close()
How to remove the margin plot over the x-axis?
Why is the vertical line not drawn?
Also is there a way to exchange the right margin to a plot which more clearly resembles the density?
edit: Here is a sample of the dataset (33kB). Read it with pd.read_pickle("./data.pickle")
I've been fiddling with an analog problem (using a scatterplot instead of the hexbin). In the end, the solution to your first point is awkwardly simple. Just add this line :
g.ax_marg_x.remove()
Regarding your second point, I've no clue as to why no line is plotted. But a workaround seems to be to use vlines instead :
g.ax_joint.vlines(x=30, ymin=49, ymax=51)
Concerning your last point, I'm afraid I haven't understood it. If you mean increasing/reducing the margin between the subplots, you can use the space argument stated in the doc.

Matplotlib/Seaborn: Boxplot collapses on x axis

I am creating a series of boxplots in order to compare different cancer types with each other (based on 5 categories). For plotting I use seaborn/matplotlib. It works fine for most of the cancer types (see image right) however in some the x axis collapses slightly (see image left) or strongly (see image middle)
https://i.imgur.com/dxLR4B4.png
Looking into the code how seaborn plots a box/violin plot https://github.com/mwaskom/seaborn/blob/36964d7ffba3683de2117d25f224f8ebef015298/seaborn/categorical.py (line 961)
violin_data = remove_na(group_data[hue_mask])
I realized that this happens when there are too many nans
Is there any possibility to prevent this collapsing by code only
I do not want to modify my dataframe (replace the nans by zero)
Below you find my code:
boxp_df=pd.read_csv(pf_in,sep="\t",skip_blank_lines=False)
fig, ax = plt.subplots(figsize=(10, 10))
sns.violinplot(data=boxp_df, ax=ax)
plt.xticks(rotation=-45)
plt.ylabel("label")
plt.tight_layout()
plt.savefig(pf_out)
The output is a per cancer type differently sized plot
(depending on if there is any category completely nan)
I am expecting each plot to be in the same width.
Update
trying to use the order parameter as suggested leads to the following output:
https://i.imgur.com/uSm13Qw.png
Maybe this toy example helps ?
|Cat1|Cat2|Cat3|Cat4|Cat5
|3.93| |0.52| |6.01
|3.34| |0.89| |2.89
|3.39| |1.96| |4.63
|1.59| |3.66| |3.75
|2.73| |0.39| |2.87
|0.08| |1.25| |-0.27
Update
Apparently, the problem is not the data but the length of the title
https://github.com/matplotlib/matplotlib/issues/4413
Therefore I would close the question
#Diziet should I delete it or does my issue might help other ones?
Sorry for not including the line below in the code example:
ax.set_title("VERY LONG TITLE", fontsize=20)
It's hard to be sure without data to test it with, but I think you can pass the names of your categories/cancers to the order= parameter. This forces seaborn to use/display those, even if they are empty.
for instance:
tips = sns.load_dataset("tips")
ax = sns.violinplot(x="day", y="total_bill", data=tips, order=['Thur','Fri','Sat','Freedom Day','Sun','Durin\'s Day'])

matplotlib pyplot side-by-side graphics

I'm trying to put two scatterplots side-by-side in the same figure. I'm also using prettyplotlib to make the graphs look a little nicer. Here is the code
fig, ax = ppl.subplots(ncols=2,nrows=1,figsize=(14,6))
for each in ['skimmer','dos','webapp','losstheft','espionage','crimeware','misuse','pos']:
ypos = df[df['pattern']==each]['ypos_m']
xpos = df[df['pattern']==each]['xpos_m']
ax[0] = ppl.scatter(ypos,xpos,label=each)
plt.title("Multi-dimensional Scaling: Manhattan")
for each in ['skimmer','dos','webapp','losstheft','espionage','crimeware','misuse','pos']:
ypos = df[df['pattern']==each]['ypos_e']
xpos = df[df['pattern']==each]['xpos_e']
ax[1] = ppl.scatter(ypos,xpos,label=each)
plt.title("Multi-dimensional Scaling: Euclidean")
plt.show()
I don't get any error when the code runs, but what I end up with is one row with two graphs. One graph is completely empty and not styled by prettyplotlib at all. The right side graphic seems to have both of my scatterplots in it.
I know that ppl.subplots is returning a matplotlib.figure.Figure and a numpy array consisting of two matplotlib.axes.AxesSubplot. But I also admit that I don't quite get how axes and subplotting works. Hopefully it's just a simple mistake somewhere.
I think ax[0] = ppl.scatter(ypos,xpos,label=each) should be ax[0].scatter(ypos,xpos,label=each) and ax[1] = ppl.scatter(ypos,xpos,label=each) should be ax[1].scatter(ypos,xpos,label=each), change those and see if your problem get solved.
I am quite sure that the issue is: you are calling ppl.scatter(...), which will try to draw on the current axis, which is the 1st axes of 2 axes you generated (and it is the left one)
Also you may find that in the end, the ax list contains two matplotlib.collections.PathCollections, bot two axis as you may expect.
Since the solution above removes the prettiness of prettyplot, we shall use an alternative solution, which is to change the current working axis, by adding:
plt.sca(ax[0_or_1])
Before ppl.scatter(), inside each loop.

Add a new axis to the right/left/top-right of an axis

How do you add an axis to the outside of another axis, keeping it within the figure as a whole? legend and colorbar both have this capability, but implemented in rather complicated (and for me, hard to reproduce) ways.
You can use the subplots command to achieve this, this can be as simple as py.subplot(2,2,1) where the first two numbers describe the geometry of the plots (2x2) and the third is the current plot number. In general it is better to be explicit as in the following example
import pylab as py
# Make some data
x = py.linspace(0,10,1000)
cos_x = py.cos(x)
sin_x = py.sin(x)
# Initiate a figure, there are other options in addition to figsize
fig = py.figure(figsize=(6,6))
# Plot the first set of data on ax1
ax1 = fig.add_subplot(2,1,1)
ax1.plot(x,sin_x)
# Plot the second set of data on ax2
ax2 = fig.add_subplot(2,1,2)
ax2.plot(x,cos_x)
# This final line can be used to adjust the subplots, if uncommentted it will remove all white space
#fig.subplots_adjust(left=0.13, right=0.9, top=0.9, bottom=0.12,hspace=0.0,wspace=0.0)
Notice that this means things like py.xlabel may not work as expected since you have two axis. Instead you need to specify ax1.set_xlabel("..") this makes the code easier to read.
More examples can be found here.

Change figsize in matplotlib polar contourf

I am using the following example Example to create two polar contour subplots. When I create as the pdf there is a lot of white space which I want to remove by changing figsize.
I know how to change figsize usually but I am having difficulty seeing where to put it in this code example. Any guidance or hint would be greatly appreciated.
Many thanks!
import numpy as np
import matplotlib.pyplot as plt
#-- Generate Data -----------------------------------------
# Using linspace so that the endpoint of 360 is included...
azimuths = np.radians(np.linspace(0, 360, 20))
zeniths = np.arange(0, 70, 10)
r, theta = np.meshgrid(zeniths, azimuths)
values = np.random.random((azimuths.size, zeniths.size))
#-- Plot... ------------------------------------------------
fig, ax = plt.subplots(subplot_kw=dict(projection='polar'))
ax.contourf(theta, r, values)
plt.show()
Another way to do this would be to use the figsize kwarg in your call to plt.subplots.
fig, ax = plt.subplots(figsize=(6,6), subplot_kw=dict(projection='polar')).
Those values are in inches, by the way.
You can easily just put plt.figsize(x,y) at the beginning of the code, and it will work. plt.figsize changes the size of all future plots, not just the current plot.
However, I think your problem is not what you think it is. There tends to be quite a bit of whitespace in generated PDFs unless you change options around. I usually use
plt.savefig( 'name.pdf', bbox_inches='tight', pad_inches=0 )
This gives as little whitespace as possible. bbox_inches='tight' tries to make the bounding box as small as possible, while pad_inches sets how many inches of whitespace there should be padding it. In my case I have no extra padding at all, as I add padding in whatever I'm using the figure for.