Some inline plots won't print from Ipython noteboook - matplotlib

I am using Ipython Notebook to generate some bar plots.
The code cell is this:
kcount =0;for k, v in pledge.groupby(['Date','Break']).sum().Amount.iteritems():
if k[0] <> kcount:
kcount=k[0]
pledge[pledge.Date==k[0]].groupby(['Break','Progcode'])['Amount'].sum().plot(kind='bar')
plt.title(k[0])
plt.figure()
This gives me a bar plot for every day of our pledge drive, showing how each show within that day did. 24 charts in all. They display great as output on the screen, but when I use the Print button in Ipython Notebook, it only prints enough graphs to fill the last page, which can vary from 3 to 6 graphs depending on the printer used. One printer used reported that it required 11x17 paper for the print job (not something I set anywhere) and when I manually set it for 8 1/2 x 11, it again only printed out the first 3 pages. I am at a loss as to what to do at this point.

As a workaround, can you can use plt.savefig('filename.png') (or .jpg, or .whatever) to save an image file and then print the files out manually?

I ended up saving these pages to a multipage PDF file and then printing them from there.
Consult the docs http://matplotlib.org/api/backend_pdf_api.html
To see how to save several figures to a multipage PDF file.
This also looks like a good resource. http://blog.marmakoide.org/?p=94

Related

How to add plot commands to a figure in more than one cell, but display it only in the end?

I want to do the following in a Jupyter Notebook:
Create a pyplot.figure in a cell;
For each subsequent cells, calculate values and plot them to that same figure without displaying anything;
At the end, in another cell, display the figure with the result of every previous plot command.
Currently, while using %matplotlib notebook, the figure is always displayed after the same cell it's been created, and I don't even call plt.show().
This is not the behavior I desire. Instead I would like to postpone the display of the figure for the last cell only, but the figure of course should contain the results of the sequential plot commands called in the cells in between.
You can capture the content of a cell of a jupyter notebook using the magic command %%capture. You can also hide any output of a specific line by putting a ; at the end of it.
Showing the figure can be done by simply typing the variable in which the figure is stored, e.g. fig.
Combining those techniques gives you
import matplotlib.pyplot as plt
%matplotlib notebook
%%capture captured
fig, ax=plt.subplots()
ax.plot([1,2,3]);
fig # now show the figure
which is probably more understandable in the acutal notebook like this:
Also see How to overlay plots from different cells?

Matplotlib video creation

EDIT: ImportanceOfBeingErnest provided the answer, however I am still inviting you all to explain, why is savefig logic different from animation logic.
I want to make a video in matplotlib. I went through manuals and examples and I just don't get it. (regarding matplotlib, I always copy examples, because after five years of python and two years of mathplotlib I still understand 0.0% of matplotlib syntax)
After half a dozen hours here is what I came up to. Well, I get empty video. No idea why.
import os
import math
import matplotlib
matplotlib.use("Agg")
from matplotlib import pyplot as plt
import matplotlib.animation as animation
# Set up formatting for the movie files
Writer = animation.writers['ffmpeg']
writer = Writer(fps=15, metadata=dict(artist='Me'), bitrate=1800)
numb=100
temp=[0.0]*numb
cont=[0.0]*numb
for i in range(int(4*numb/10),int(6*numb/10)):
temp[i]=2
cont[i]=2
fig = plt.figure()
plts=fig.add_subplot(1,1,1)
plts.set_ylim([0,2.1])
plts.get_xaxis().set_visible(False)
plts.get_yaxis().set_visible(False)
ims = []
for i in range(1,10):
line1, = plts.plot(range(0,numb),temp, linewidth=1, color='black')
line2, = plts.plot(range(0,numb),cont, linewidth=1, color='red')
# savefig is here for testing, works perfectly!
# fig.savefig('test'+str(i)+'.png', bbox_inches='tight', dpi=300)
ims.append([line1,line2])
plts.lines.remove(line1)
plts.lines.remove(line2)
for j in range(1,10):
tempa=0
for k in range (1,numb-1):
tempb=temp[k]+0.51*(temp[k-1]-2*temp[k]+temp[k+1])
temp[k-1]=tempa
tempa=tempb
temp[numb-1]=0
for j in range(1,20):
conta=0
for k in range (1,numb-1):
contb=cont[k]+0.255*(cont[k-1]-2*cont[k]+cont[k+1])
cont[k-1]=conta
conta=contb
cont[numb-1]=0
im_ani = animation.ArtistAnimation(fig, ims, interval=50, repeat_delay=3000,blit=True)
im_ani.save('im.mp4', writer=writer)
Can someone help me with this?
If you want to have a plot which is not empty, the main idea would be not to remove the lines from the plot.
That is, delete the two lines
plts.lines.remove(line1)
plts.lines.remove(line2)
If you delete these two lines the output will look something like this
[Link to orginial size animation]
Now one might ask, why do I not need to remove the artist in each iteration step, as otherwise all of the lines would populate the canvas at once?
The answer is that the ArtistAnimation takes care of this. It will only show those artists in the supplied list that correspond to the given time step. So while at the end of the for loop you end up with all the lines drawn to the canvas, once the animation starts they will all be removed and only one set of artists is shown at a time.
In such a case it is of course not a good idea to use the loop for saving the individual images as the final image would contain all of the drawn line at once,
The solution is then either to make two runs of the script, one for the animation, and one where the lines are removes in each timestep. Or, maybe better, use the animation istself to create the images.
im_ani.save('im.png', writer="imagemagick")
will create the images as im-<nr>.png in the current folder. It will require to have imagemagick installed.
I'm trying here to answer the two questions from the comments:
1. I have appended line1 and line2 before deleting them. Still they disappeared in the final result. How come?
You have appended the lines to a list. After that you removed the lines from the axes. Now the lines are in the list but not part of the axes. When doing the animation, matplotlib finds the lines in the list and makes them visible. But they are not in the axes (because they have been removed) so the visibility of some Line2D object, which does not live in any axes but only somewhere in memory, is changed. But that isn't reflected in the plot because the plot doesn't know this line any more.
2. If I understand right, when you issue line1, = plts.plot... command then the line1 plot object is added to the plts graph object. However, if you change the line1 plot object by issuing line1, = plts.plot... command again, matplotlib does change line1 object but before that saves the old line1 to the plts graph object permanently. Is this what caused my problem?
No. The first step is correct, line1, = plts.plot(..) adds a Line2D object to the axes. However, in a later loop step line1, = plts.plot() creates another Line2D object and puts it to the canvas. The initial Line2D object is not changed and it doesn't know that there is now some other line next to it in the plot. Therefore, if you don't remove the lines they will all be visible in the static plot at the end.

Display multiple output tables in IPython notebook using Pandas

I now know that I can output multiple charts from IPython pandas by embedding them in one plot space which will appear in a single output cell in the notebook.
Can I do something similar with Pandas HTML Tables?
I am getting data from multiple tabs (about 15-20) on a spreadsheet and running them though a set of regressions and I'd like to display the results together, perhaps 2 up.. But since the function to display the table only displays one, the last one, not sure how to approach. Ideas?
I'd even be happy to display in successive output cells.. not real sure how to do that either But I guess I could do something really dirty calling each (spreadsheet) tab in a separate cell.. Ughh... FWIW I'm on IPython 2.0 dev and Pandas 13
This does it:
area-tabs=list(map(str, range(1, 28))) # for all 27 tabs
#area_tabs=['1','2'] # for specific tabs
for area_tabs in area_tabs:
actdf,aname = get_data(area_tabs) #get_data gets the data and does a bunch or regression and table building
aname,actdf,merged2,mergederrs,montdist,ols_test,mergedfcst=projections(actdf)
IPython.display.display('Area: %s' % aname, mergederrs.tail(12))
IPython.display.display('Area: %s' % aname, ols_test)
IPython.display.display('Area: %s' % aname, merged2)
SO I can print all the results for each tab on the spreadsheet

Aliasing when saving matplotlib filled contour plot to .pdf or .eps

I'm generating a filed contour plot with the matplotlib.pyplot.contourf() function. The arguments in the call to the function are:
contourf(xvec,xvec,w,levels,cmap=matplotlib.cm.jet)
where
xvec = numpy.linspace(-3.,3.,50)
levels = numpy.linspace(-0.01,0.25,100)
and w is my data.
The resulting plot looks pretty good on screen, but when I save to pdf using a call to matplotlib.pyplot.savefig(), the resulting pdf has a lot of aliasing (I think that is what it is) going on. The call to savefig is simply savefig('filename.pdf'). I have tried using the dpi argument, but without luck. A call to matplotlib.get_backend() spits out 'TkAgg'.
I will attach a figure saved as pdf, compared to a figure saved as png (similar to what it looks like on screen) to demonstrate the problem:
png wihtout aliasing: https://dl.dropbox.com/u/6042643/wigner_g0.17.png
pdf with aliasing: https://dl.dropbox.com/u/6042643/wigner_g0.17.pdf
Please let me know if there are any other details I could give to help you give an answer. I should mention that saving as .eps gives similar bad results as saving to pdf. But the pdf shows the problem even clearer. My goal is to end up with a production quality .eps that I can attach to a latex document to be published as a scientific paper. I would be happy with some kind of work around where I save in one format, then convert it, if I can find a way that gives satisfying results.
Best,
Arne
After using the useful answer by #pelson for a while, I finally found a proper solution to this long-standing problem (currently in Matplotlib 3), which does not require multiple calls to contour or rasterizing the figure.
I refer to my original answer here for a more extensive explanation and examples.
In summary, the solution consists of the following lines:
cnt = plt.contourf(x, y, z)
for c in cnt.collections:
c.set_edgecolor("face")
plt.savefig('test.pdf')
I had no idea that contouring in pdf was so bad. You're right, I think the contours are being anti-aliased by the PDF renderers outside of matplotlib. It is for this reason I think you need to be particularly careful which application you use to view the resulting PDF - the best behaviour I have seen is with GIMP, but I'm sure there are plenty of other viewers which perform well.
To fix this problem (when viewing the PDF with GIMP), I was able to "rasterize" the contours produced with matplotlib to avoid the ugly white line problem:
import matplotlib.pyplot as plt
import numpy as np
xs, ys = np.mgrid[0:30, 0:40]
data = (xs - 15) ** 2 + (ys - 20) ** 2 + (np.sin(ys) + 10) ** 2
cs = plt.contourf(xs, ys, data, 60, cmap='jet')
# Rasterize the contour collections
for c in cs.collections:
c.set_rasterized(True)
plt.savefig('test.pdf')
This produced a contour plot which did not exhibit the problems you've shown.
Another alternative, perhaps better, approach, would be to fool the anti-aliasing by putting coloured lines below the contourf.
import matplotlib.pyplot as plt
import numpy as np
xs, ys = np.mgrid[0:30, 0:40]
data = (xs - 15) ** 2 + (ys - 20) ** 2 + (np.sin(ys) + 10) ** 2
# contour the plot first to remove any AA artifacts
plt.contour(xs, ys, data, 60, cmap='jet', lw=0.1)
cs = plt.contourf(xs, ys, data, 60, cmap='jet')
plt.savefig('test.pdf')
I should note that I don't see these problems if I save the figure as a ".ps" rather than a ".pdf" - perhaps that is a third alternative.
Hope this helps you get the paper looking exactly how you want it.

eps cmyk color matplotlib output

I need to output my plots in EPS with the CMYK color space. Unfortunately this particular format is requested by the journal I am submitting my work to!
This discussion was the only one I could find that has addressed the issue but it is more than 2 years old. I was hoping there might be some updates fixing the problem by now.
All my programming is in Python3 and so far I have been saving my plots in PDF which had no problem. But now that I want to plot EPS there is a problem. For example the code bellow prints the simple plot in .png and .pdf but the .eps output is totally blank!
import numpy as np
import matplotlib.pyplot as plt
X=[1,2,3]
Y=[4,5,6]
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(X,Y)
fig.savefig('test.eps')
fig.savefig('test.pdf')
fig.savefig('test.png')
So I have two questions:
How can I fix the eps output?
How can I set the eps output color space to CMYK?
Thanks in advance.
I too have the same problem. One workaround I have found is to save plots as .svg and then use a program like Inkscape to convert to eps. I used to be able to save in .eps without any issues and then lost the ability after an update.
Update I was able to solve this problem for my specific setup by changing a few lines in my .matplotlibrc, so I will post the relevant lines here in the hope that it may be helpful to you as well. Note this requires that you have xpdf and ghostscript already installed.
For me the important one was
##Saving Figures
ps.usedistiller : xpdf
But I also have
path.simplify : True
savefig.format : eps
Now I am able to save directly to .eps and include them in LaTeX'ed journal articles...