Aliasing when saving matplotlib filled contour plot to .pdf or .eps

I'm generating a filled contour plot with the matplotlib.pyplot.contourf() function. The arguments in the call to the function are:
contourf(xvec,xvec,w,levels,cmap=matplotlib.cm.jet)
where
xvec = numpy.linspace(-3.,3.,50)
levels = numpy.linspace(-0.01,0.25,100)
and w is my data.
The resulting plot looks pretty good on screen, but when I save to pdf using a call to matplotlib.pyplot.savefig(), the resulting pdf has a lot of aliasing (I think that is what it is) going on. The call to savefig is simply savefig('filename.pdf'). I have tried using the dpi argument, but without luck. A call to matplotlib.get_backend() spits out 'TkAgg'.
I will attach a figure saved as pdf, compared to a figure saved as png (similar to what it looks like on screen) to demonstrate the problem:
png without aliasing: https://dl.dropbox.com/u/6042643/wigner_g0.17.png
pdf with aliasing: https://dl.dropbox.com/u/6042643/wigner_g0.17.pdf
Please let me know if there are any other details I could give to help you give an answer. I should mention that saving as .eps gives similarly bad results as saving to pdf, but the pdf shows the problem even more clearly. My goal is to end up with a production-quality .eps that I can include in a LaTeX document to be published as a scientific paper. I would be happy with some kind of workaround where I save in one format and then convert it, if I can find a way that gives satisfying results.
Best,
Arne

After using the useful answer by @pelson for a while, I finally found a proper solution to this long-standing problem (currently in Matplotlib 3), which does not require multiple calls to contour or rasterizing the figure.
I refer to my original answer here for a more extensive explanation and examples.
In summary, the solution consists of the following lines:
cnt = plt.contourf(x, y, z)
for c in cnt.collections:
    c.set_edgecolor("face")
plt.savefig('test.pdf')
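For reference, a self-contained version of the snippet above might look like the sketch below. The data, the number of levels, and the in-memory buffer are made up for illustration. The `isinstance` branch is an assumption about version differences: as far as I know, Matplotlib 3.8 and later return a contour set that is itself a single collection and deprecate `.collections`, while older releases only offer the per-level `.collections` list.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import Collection

# Made-up data for illustration only
x = np.linspace(-3.0, 3.0, 50)
X, Y = np.meshgrid(x, x)
z = np.exp(-(X**2 + Y**2))

fig, ax = plt.subplots()
cnt = ax.contourf(X, Y, z, 50)

# Draw each filled region's edge in the same colour as its face
if isinstance(cnt, Collection):
    # Newer Matplotlib: the contour set is one collection
    cnt.set_edgecolor("face")
else:
    # Older Matplotlib: one collection per contour level
    for c in cnt.collections:
        c.set_edgecolor("face")

# Save to memory here; pass a filename such as 'test.pdf' to write a file
buf = io.BytesIO()
fig.savefig(buf, format="pdf")
pdf_bytes = buf.getvalue()
```

The only change relative to a plain contourf call is the edge colour, so the output stays fully vector.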

I had no idea that contouring in pdf was so bad. You're right, I think the contours are being anti-aliased by the PDF renderers outside of matplotlib. It is for this reason I think you need to be particularly careful which application you use to view the resulting PDF - the best behaviour I have seen is with GIMP, but I'm sure there are plenty of other viewers which perform well.
To fix this problem (when viewing the PDF with GIMP), I was able to "rasterize" the contours produced with matplotlib to avoid the ugly white line problem:
import matplotlib.pyplot as plt
import numpy as np
xs, ys = np.mgrid[0:30, 0:40]
data = (xs - 15) ** 2 + (ys - 20) ** 2 + (np.sin(ys) + 10) ** 2
cs = plt.contourf(xs, ys, data, 60, cmap='jet')
# Rasterize the contour collections
for c in cs.collections:
    c.set_rasterized(True)
plt.savefig('test.pdf')
This produced a contour plot which did not exhibit the problems you've shown.
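When only the contours are rasterized, the resolution of that embedded image is controlled by the dpi passed to savefig, while the axes, ticks and text stay vector. A rough sketch follows; the `dpi=300` value and the in-memory buffer are assumptions for illustration, and the `isinstance` check is there because newer Matplotlib releases (3.8 and later, as far as I know) return a single collection instead of a `.collections` list.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import Collection

xs, ys = np.mgrid[0:30, 0:40]
data = (xs - 15) ** 2 + (ys - 20) ** 2 + (np.sin(ys) + 10) ** 2

fig, ax = plt.subplots()
cs = ax.contourf(xs, ys, data, 60, cmap="jet")

# Collect the artists that make up the filled contours
artists = [cs] if isinstance(cs, Collection) else list(cs.collections)
for artist in artists:
    artist.set_rasterized(True)

# dpi sets the resolution of the rasterized part of the PDF
buf = io.BytesIO()
fig.savefig(buf, format="pdf", dpi=300)
pdf_bytes = buf.getvalue()
```

A higher dpi gives a sharper raster at the cost of a larger file.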
Another alternative, perhaps better, approach, would be to fool the anti-aliasing by putting coloured lines below the contourf.
import matplotlib.pyplot as plt
import numpy as np
xs, ys = np.mgrid[0:30, 0:40]
data = (xs - 15) ** 2 + (ys - 20) ** 2 + (np.sin(ys) + 10) ** 2
# contour the plot first to remove any AA artifacts
plt.contour(xs, ys, data, 60, cmap='jet', linewidths=0.1)
cs = plt.contourf(xs, ys, data, 60, cmap='jet')
plt.savefig('test.pdf')
I should note that I don't see these problems if I save the figure as a ".ps" rather than a ".pdf" - perhaps that is a third alternative.
Hope this helps you get the paper looking exactly how you want it.

Related

How to export svg in matplotlib with correct mm scale

I am trying to export a figure from matplotlib for laser cutting. The figure is plotted with millimeters as the units.
I'm trying to ensure the correct scale by getting the bounding box in inches and then setting the figure size to that value:
import matplotlib.pyplot as plt
ax = plt.subplot(111)
<snipped for brevity...plotting of lines and paths>
# list(...) is needed on Python 3, where map() returns an iterator
x_bound = list(map(mm_to_inch, ax.get_xbound()))
y_bound = list(map(mm_to_inch, ax.get_ybound()))
plt.gcf().set_size_inches(x_bound[1] - x_bound[0], y_bound[1] - y_bound[0])
plt.axis('off')
plt.savefig('{0}.svg'.format(self.name), format='svg')
The exported .svg is ~2/3rds of the intended scale and I'm not familiar enough with axes and figures to know why. Additionally, there is a black border around the intended geometry. Here is some example output:
.svg output (converted to .png)
How should I remove the black border and scale the .svg correctly?
You probably want to remove the margins around the axes completely,
plt.gcf().subplots_adjust(0,0,1,1)
I might note however that the result may not be precise enough for the application. Definitely also consider creating the figure with a CAD program.
Based off of ImportanceOfBeingErnest's answer and some responses to other stackoverflow questions, the following solution works:
plt.axis('off')
plt.margins(0, 0)
plt.gca().xaxis.set_major_locator(plt.NullLocator())
plt.gca().yaxis.set_major_locator(plt.NullLocator())
plt.subplots_adjust(top=1, bottom=0, right=1, left=0, hspace=0, wspace=0)
x_bound = list(map(mm_to_inch, self._ax.get_xbound()))
y_bound = list(map(mm_to_inch, self._ax.get_ybound()))
plt.gcf().set_size_inches(x_bound[1] - x_bound[0], y_bound[1] - y_bound[0])
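To check that the figure really ends up at the intended physical size, something along these lines may help. It is only a sketch: `mm_to_inch` is a hypothetical helper (its definition is not shown in the posts above), and the 100 mm x 50 mm rectangle is made-up geometry.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

MM_PER_INCH = 25.4


def mm_to_inch(mm):
    """Hypothetical helper: convert millimetres to inches."""
    return mm / MM_PER_INCH


fig = plt.figure()
ax = fig.add_axes([0, 0, 1, 1])  # axes fill the whole figure, no margins

# Made-up geometry: a 100 mm x 50 mm rectangle, in data units of mm
ax.plot([0, 100, 100, 0, 0], [0, 0, 50, 50, 0])
ax.set_xlim(0, 100)
ax.set_ylim(0, 50)
ax.axis("off")

# Size the figure so that one data unit corresponds to one millimetre
x_bound = [mm_to_inch(v) for v in ax.get_xbound()]
y_bound = [mm_to_inch(v) for v in ax.get_ybound()]
fig.set_size_inches(x_bound[1] - x_bound[0], y_bound[1] - y_bound[0])

width_in, height_in = fig.get_size_inches()

buf = io.BytesIO()
fig.savefig(buf, format="svg")
svg_bytes = buf.getvalue()
```

Note that the figure is saved without bbox_inches='tight' here; as far as I can tell, that option crops the output and would change the size again.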

Difference between matplotlib.contourf and matlab.contourf() - odd sharp edges in matplotlib

I am a recent migrant from Matlab to Python and have recently started working with Numpy and Matplotlib. I recoded one of my scripts from Matlab, which uses Matlab's contourf function, into Python using matplotlib's corresponding contourf function. I managed to replicate the output in Python, except that the contourf plots are not exactly the same, for a reason that is unknown to me. When I run the contourf function in matplotlib, I get an otherwise nice figure, but it has sharp edges on the contour levels at the top and bottom, which should not be there (see Figure 1 below, matplotlib output). When I export the arrays I used in Python to Matlab (i.e. exactly the same data set that was used to generate the matplotlib contourf plot) and use Matlab's contourf function, I get a slightly different output, without those sharp contour-level edges (see Figure 2 below, Matlab output). I used the same number of levels in both figures. In Figure 3 I have made a scatter plot of the same data, which shows that there are no such sharp edges in the data as shown in the contourf plot (I added contour lines just for reference).
An example data set can be downloaded through the Dropbox link given below. The data set contains three txt files: X, Y, Z. Each of them is a 500x500 array, which can be used directly with contourf(), i.e. plt.contourf(X,Y,Z,...). The code that I used was
plt.contourf(X,Y,Z,10, cmap=plt.cm.jet)
plt.contour(X,Y,Z,10,colors='black', linewidths=0.5)
plt.axis('equal')
plt.axis('off')
Does anyone have an idea why this happens? I would appreciate any insight on this!
Cheers,
Jussi
Below are the details of my setup:
Python 3.7.0
IPython 6.5.0
matplotlib 2.2.3
Matplotlib output
Matlab output
Matplotlib-scatter
Link to data set
The confusing thing about the matlab plot is that its colorbar shows many more levels than there actually are in the plot. Hence you don't see the actual intervals that are contoured.
You would achieve the same result in matplotlib by choosing 12 instead of 11 levels.
import numpy as np
import matplotlib.pyplot as plt
X, Y, Z = [np.loadtxt("data/roundcontourdata/{}.txt".format(i)) for i in list("XYZ")]
levels = np.linspace(Z.min(), Z.max(), 12)
cntr = plt.contourf(X,Y,Z,levels, cmap=plt.cm.jet)
plt.contour(X,Y,Z,levels,colors='black', linewidths=0.5)
plt.colorbar(cntr)
plt.axis('equal')
plt.axis('off')
plt.show()
So in conclusion, both plots are correct and show the same data. Just the levels being automatically chosen are different. This can be circumvented by choosing custom levels depending on the desired visual appearance.
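If the data files are not at hand, the same idea can be tried on synthetic data. The sketch below uses a made-up Gaussian bump in place of the X, Y, Z files and passes one explicit `levels` array to both contourf and contour, so the filled bands and the black lines are guaranteed to share the same boundaries.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

# Made-up data standing in for the X, Y, Z text files
x = np.linspace(-2.0, 2.0, 200)
X, Y = np.meshgrid(x, x)
Z = np.exp(-(X**2 + Y**2))

# 12 level boundaries give 11 filled bands
levels = np.linspace(Z.min(), Z.max(), 12)

fig, ax = plt.subplots()
cntr = ax.contourf(X, Y, Z, levels, cmap="jet")
lines = ax.contour(X, Y, Z, levels, colors="black", linewidths=0.5)
fig.colorbar(cntr, ax=ax)
ax.axis("equal")
ax.axis("off")
```

Changing the third argument of np.linspace is then the single place where the number of levels is controlled.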

matplotlib figure tiny when using subplots

I'm trying to get a plot with custom aspect ratio to display properly. I am using Jupyter notebooks for the rendering, but the way I've normally done this is to adjust the 'figsize' attribute in the subplots. I've done it like below:
from matplotlib import pyplot as plt
fig,axes = plt.subplots(1,1,figsize=(16.0,8.0),frameon=False)
The problem is that, while the aspect ratio seems to come out correct (judging by eye), the figure does not use up even close to the whole page width, and is therefore tiny and hard to read.
I guess it's behaving like there are some sort of margins set on the left and right, but I can't find the global setting that controls this. I have been using the list of settings here, with no success finding a relevant one.
My question(s) are
How do I adjust the aspect ratio without impacting the overall size of the figure (think font sizes of the axis labels)? I don't need the width of my screen to be a constraint, I'd be perfectly happy for Jupyter notebooks to give me a horizontal scroll bar.
Is there a place with a more comprehensive and well-written documentation of all the matplotlib parameters that are available? The one I linked above is awkward because it gives the parameters in the form of an example matplotlibrc file. I'd like to know if a single page with (good) descriptions of all the parameters exists.
EDIT: it has been pointed out that this could be a jupyter problem and that I am setting the aspect ratio correctly. I'm using Jupyter version 1.0.0. Below is a picture of the output of a simplified notebook.
It's easy to see that the figure does not use even close to the available horizontal space.
The code in the notebook is:
#imports
import numpy as np
#set up a plot
import matplotlib as mpl
from matplotlib import pyplot as plt
#got smarter about the mpl config: see mplstyles/ directory
plt.style.use('standard')
#set up a 2-d plot
fig,axes = plt.subplots(1,1,figsize=(16.0,8.0),frameon=False)
ax1 = axes
#need to play with axis
mpl.rcParams['ytick.minor.visible'] = False
xmin = -10
xmax = 10
ymin = -10
ymax = 10
x = np.random.normal(0,5,(20000,))
y = np.random.normal(0,5,(20000,))
h = ax1.hist2d(x,y, bins=200, cmap='inferno')
ax1.set_xlim(xmin,xmax)
ax1.set_ylim(ymin,ymax)
ax1.set_xlabel('epoch time [Unix]',**axis_font)
ax1.set_ylabel(r'relative time [$\mu$s]',**axis_font)
ax1.grid(True)
#lgnd= ax1.legend(loc=2,prop={'size':22})
for axis in ['top','bottom','left','right']:
    ax1.spines[axis].set_linewidth(2)
plt.tight_layout()
#plt.savefig('figures/thefigure.eps')
plt.show()
The mpl style file that I use in the plt.style.use command is:
#trying to customize here, see:
#https://matplotlib.org/users/customizing.html
#matplotlib.rc('figure', figsize=(3.4, 3.4*(4/6)))
lines.linewidth : 2
#ticks
xtick.top : False
xtick.bottom : True
xtick.minor.visible : True
xtick.direction : in
xtick.major.size : 8
xtick.minor.size : 4
xtick.major.width : 2
xtick.minor.width : 1
xtick.labelsize : 22
ytick.left : True
ytick.right : False
ytick.minor.visible : True
ytick.direction : in
ytick.major.size : 8
ytick.minor.size : 4
ytick.major.width : 2
ytick.minor.width : 1
ytick.labelsize : 22
#error bars
#errorbar.capsize : 3
#axis stuff
axes.labelsize : 22
EDIT 2: restricting the range of the vectors to the range of the axes before plotting results in the desired output. See the below figure:
The added/modified lines were:
xnew = x[(np.abs(x)<10) & (np.abs(y)<10)]
ynew = y[(np.abs(x)<10) & (np.abs(y)<10)]
h = ax1.hist2d(xnew,ynew, bins=200, cmap='inferno')
Apparently there was a bug in matplotlib 2.2.2 which has since been fixed in the development version. You may of course install the current development version from GitHub.
The problem comes from setting axes limits (ax1.set_xlim(-10,10)) which are smaller than the initial image. For some reason the original limits were still used to calculate the tight bbox for saving as png.
The workaround would be not to set any axes limits manually, but let the histogram plot be calculated directly with the desired limits in mind. In this case -10,10, e.g.:
x = np.random.normal(0,5,(20000,))
y = np.random.normal(0,5,(20000,))
bins = np.linspace(-10,10,201)
h = ax1.hist2d(x,y, bins=bins, cmap='inferno')
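As a quick check of this workaround, the values returned by hist2d can be inspected: with 201 bin edges between -10 and 10 the histogram has 200 x 200 bins, and the axes limits follow the edges without any call to set_xlim or set_ylim. This is a sketch with a seeded random generator (an assumption, to make it reproducible).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # seeded only for reproducibility
x = rng.normal(0, 5, 20000)
y = rng.normal(0, 5, 20000)

# 201 edges between -10 and 10 give 200 bins per axis
bins = np.linspace(-10, 10, 201)

fig, ax1 = plt.subplots(figsize=(16.0, 8.0))
counts, xedges, yedges, image = ax1.hist2d(x, y, bins=bins, cmap="inferno")

# Axes limits now match the bin edges without being set by hand
xlim = ax1.get_xlim()
ylim = ax1.get_ylim()
```

Points outside the bin range are simply not counted, so no manual masking of x and y is needed.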
To change the font sizes of the axis labels, you'd have to use plt.rc or plt.rcParams (more on this here), so you needn't worry about doing that when using figsize.
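A small sketch of how the font sizes could be set independently of figsize is shown below. It uses plt.rc_context so the settings only apply inside the with block; the size 22 mirrors the style file in the question, and the choice of rc_context over a global plt.rc call is just one option.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Font sizes in points; they do not scale with figsize
font_settings = {
    "axes.labelsize": 22,
    "xtick.labelsize": 22,
    "ytick.labelsize": 22,
}

with plt.rc_context(font_settings):
    # Axes created inside the block pick up the temporary settings
    fig, ax = plt.subplots(figsize=(16.0, 8.0))
    ax.set_xlabel("epoch time [Unix]")
    ax.set_ylabel(r"relative time [$\mu$s]")

label_size = ax.xaxis.label.get_fontsize()
```

Changing figsize to, say, (8.0, 4.0) leaves label_size unchanged, which is why the text looks relatively larger on a smaller figure.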
I don't see any problems with the code you posted, could you post a picture of what you get and what you'd like to get? This is what I get using that configuration, on Jupyter notebooks, just plotting a very simple graph:
Do note, however, Jupyter limits the size of your plots automatically (see below):
And I'm afraid I can't help you with your second question, as I've always found matplotlib's documentation sufficient for all my needs... good luck!

Dotted line style from non-evenly distributed data

I'm new to Python and MatPlotlib.
This is my first posting to Stackoverflow - I've been unable to find the answer elsewhere and would be grateful for your help.
I'm using Windows XP, with Enthought Canopy v1.1.1 (32 bit).
I want to plot a dotted-style linear regression line through a scatter plot of data, where both x and y arrays contain random floating point data.
The dots in the resulting dotted line are not distributed evenly along the regression line, and are "smeared together" in the middle of the red line, making it look messy (see upper plot resulting from attached minimal example code).
This does not seem to occur if the items in the array of x values are evenly distributed (lower plot).
I'm therefore guessing that this is an issue with how MatplotLib renders dotted lines, or with how Canopy interfaces Python with Matplotlib.
Please could you tell me a workaround which will make the dots on the dotted line type appear evenly distributed; even if both x and y data are non-evenly distributed; whilst still using Canopy and Matplotlib?
(As a general point, I'm always keen to improve my coding skills - if any code in my example can be written more neatly or concisely, I'd be grateful for your expertise).
Many thanks in anticipation
Dave
(UK)
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats
#generate data
x1=10 * np.random.random_sample((40))
x2=np.linspace(0,10,40)
y=5 * np.random.random_sample((40))
slope, intercept, r_value, p_value, std_err = stats.linregress(x1,y)
line = (slope*x1)+intercept
plt.figure(1)
plt.subplot(211)
plt.scatter(x1,y,color='blue', marker='o')
plt.plot(x1,line,'r:',label="Regression Line")
plt.legend(loc='upper right')
slope, intercept, r_value, p_value, std_err = stats.linregress(x2,y)
line = (slope*x2)+intercept
plt.subplot(212)
plt.scatter(x2,y,color='blue', marker='o')
plt.plot(x2,line,'r:',label="Regression Line")
plt.legend(loc='upper right')
plt.show()
Welcome to SO.
You have already identified the problem yourself, but seem a bit surprised that a random x-array results in the line being 'cluttered'. But you draw a dotted line repeatedly over the same location, so it seems like normal behavior to me that it gets smeared in places where multiple dotted lines lie on top of each other.
If you don't want that, you can sort your array and use that to calculate the regression line and plot it. Since it's a linear regression, just using the min and max values would also work.
x1_sorted = np.sort(x1)
line = (slope * x1_sorted) + intercept
or
x1_extremes = np.array([x1.min(),x1.max()])
line = (slope * x1_extremes) + intercept
The last should be faster if x1 becomes very large.
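Put together, the two-point variant could look like the sketch below. To keep it free of extra dependencies it uses np.polyfit in place of scipy.stats.linregress (a substitution, not what the question uses), and the data are random numbers from a seeded generator.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)  # seeded only for reproducibility
x1 = 10 * rng.random(40)
y = 5 * rng.random(40)

# Degree-1 least-squares fit; stands in for stats.linregress here
slope, intercept = np.polyfit(x1, y, 1)

# A straight line only needs its two end points
x1_extremes = np.array([x1.min(), x1.max()])
line = slope * x1_extremes + intercept

fig, ax = plt.subplots()
ax.scatter(x1, y, color="blue", marker="o")
(reg_line,) = ax.plot(x1_extremes, line, "r:", label="Regression Line")
ax.legend(loc="upper right")
```

Because the line is drawn once from left to right, the dots of the ':' style are spaced evenly along it.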
With regard to your last comment: in your example you use what's called the 'state-machine' environment for plotting. It means that the specified commands are applied to the active figure and the active axes (subplots).
You can also consider the OO approach, where you get figure and axes objects. This means you can access any figure or axes at any time, not just the active one. It's useful when passing an axes to a function, for example.
In your example both would work equally well and it would be more a matter of taste.
A small example:
# create a figure with 2 subplots (2 rows, 1 column)
fig, axs = plt.subplots(2,1)
# plot in the first subplots
axs[0].scatter(x1,y,color='blue', marker='o')
axs[0].plot(x1,line,'r:',label="Regression Line")
# plot in the second
axs[1].plot()
etc...
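To illustrate the point about passing an axes to a function, here is a rough sketch. `plot_with_fit` is a hypothetical helper invented for this example, and np.polyfit is used in place of stats.linregress to keep it short.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np


def plot_with_fit(ax, x, y):
    """Hypothetical helper: scatter x, y on ax and overlay a dotted linear fit."""
    slope, intercept = np.polyfit(x, y, 1)
    ends = np.array([x.min(), x.max()])
    ax.scatter(x, y, color="blue", marker="o")
    ax.plot(ends, slope * ends + intercept, "r:", label="Regression Line")
    ax.legend(loc="upper right")
    return slope, intercept


rng = np.random.default_rng(2)  # seeded only for reproducibility
x1 = 10 * rng.random(40)
x2 = np.linspace(0, 10, 40)
y = 5 * rng.random(40)

# Two subplots (2 rows, 1 column), each handled by the same function
fig, axs = plt.subplots(2, 1)
plot_with_fit(axs[0], x1, y)
plot_with_fit(axs[1], x2, y)
```

The function never touches the "active" axes, so the order in which subplots are created or filled does not matter.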

eps cmyk color matplotlib output

I need to output my plots in EPS with the CMYK color space. Unfortunately this particular format is requested by the journal I am submitting my work to!
This discussion was the only one I could find that has addressed the issue but it is more than 2 years old. I was hoping there might be some updates fixing the problem by now.
All my programming is in Python3 and so far I have been saving my plots in PDF, which had no problem. But now that I want to plot EPS there is a problem. For example, the code below prints the simple plot in .png and .pdf but the .eps output is totally blank!
import numpy as np
import matplotlib.pyplot as plt
X=[1,2,3]
Y=[4,5,6]
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(X,Y)
fig.savefig('test.eps')
fig.savefig('test.pdf')
fig.savefig('test.png')
So I have two questions:
How can I fix the eps output?
How can I set the eps output color space to CMYK?
Thanks in advance.
I too have the same problem. One workaround I have found is to save plots as .svg and then use a program like Inkscape to convert to eps. I used to be able to save in .eps without any issues and then lost the ability after an update.
Update: I was able to solve this problem for my specific setup by changing a few lines in my .matplotlibrc, so I will post the relevant lines here in the hope that it may be helpful to you as well. Note this requires that you have xpdf and ghostscript already installed.
For me the important one was
##Saving Figures
ps.usedistiller : xpdf
But I also have
path.simplify : True
savefig.format : eps
Now I am able to save directly to .eps and include them in LaTeX'ed journal articles...