knitr pdflatex output discrepancy between console and PDF

I am running RStudio Version 0.98.484 and R version 3.0.2 on OS X Mavericks.
While using knitr I noticed a discrepancy between the console output from a summary() command and that generated in the PDF (via pdflatex). Here is the example.
\documentclass[11pt]{article}
\usepackage{MinionPro}
\usepackage{MnSymbol}
\usepackage[margin = 1 in]{geometry}
\geometry{verbose,tmargin=2.5cm,bmargin=2.5cm,lmargin=2.5cm,rmargin=2.5cm}
\setcounter{secnumdepth}{2}
\setcounter{tocdepth}{2}
\usepackage{url}
\usepackage[unicode=true,pdfusetitle,
bookmarks=true,bookmarksnumbered=true,bookmarksopen=true,bookmarksopenlevel=2,
breaklinks=false,pdfborder={0 0 1},backref=false,colorlinks=false]
{hyperref}
\hypersetup{
pdfstartview={XYZ null null 1}}
\usepackage{breakurl}
\usepackage{color}
\usepackage{graphicx}
\usepackage{fancyhdr}
\definecolor{darkred}{rgb}{0.5,0,0}
\definecolor{darkgreen}{rgb}{0,0.5,0}
\definecolor{darkblue}{rgb}{0,0,0.5}
\hypersetup{ colorlinks,
linkcolor=darkblue,
filecolor=darkgreen,
urlcolor=darkred,
citecolor=darkblue }
\definecolor{keywordcolor}{rgb}{0,0.6,0.6}
\definecolor{delimcolor}{rgb}{0.461,0.039,0.102}
\definecolor{Rcommentcolor}{rgb}{0.101,0.043,0.432}
\usepackage{booktabs}
\usepackage{listings}
\lstset{breaklines=true,showstringspaces=false}
\makeatletter
\newcommand\gobblepars{%
\@ifnextchar\par%
{\expandafter\gobblepars\@gobble}%
{}}
\makeatother
\newcommand{\R}{R}
\title{\textsc{Laboratory Session 1}}
\author{Ani}
\begin{document}
<<setup, include=FALSE, cache=FALSE>>=
library(knitr)
# set global chunk options
opts_chunk$set(fig.path='figure/minimal-', fig.align='center', fig.show='hold')
options(replace.assign=FALSE, width=90, tidy=TRUE)
render_listings()
@
\maketitle
<<chunk26>>=
require(rpart)
data(car90)
summary(car90$Price)
@
Hello!
\end{document}
The console shows:
> summary(car90$Price)
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
5866 9995 13070 15810 19940 41990 6
The PDF shows:
Min. 1st Qu. Median Mean 3rd Qu. Max. NA 's
5870 10000 13100 15800 19900 42000 6
Why would this be happening? There are no decimals to round up. Any clues would be much appreciated.
Thanks!!
Ani

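That is because knitr sets options(digits=4), whereas the default value of the digits option in R is 7. By default summary() rounds to max(3, getOption("digits") - 3) significant digits, so the console (digits=7) shows four significant digits while the knitr output (digits=4) shows only three. You can reproduce it with: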
library(rpart)
options(digits=4)
summary(car90$Price)
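The same rounding can be sketched outside R. The snippet below is only an illustration in Python, assuming summary() rounds to max(3, digits - 3) significant digits (the default of summary.default in R versions of that era). The helper names signif and summary_digits are made up for this example, and the inputs are the already-rounded console values from the question, not the raw data.

```python
import math


def signif(x, digits):
    """Round x to the given number of significant digits (similar to R's signif)."""
    if x == 0:
        return 0
    exponent = math.floor(math.log10(abs(x)))
    return round(x, digits - 1 - exponent)


def summary_digits(digits_option):
    """Assumed default used by summary(): max(3, getOption("digits") - 3)."""
    return max(3, digits_option - 3)


# Values as printed on the console in the question (digits = 7).
console_values = [5866, 9995, 13070, 15810, 19940, 41990]

for option in (7, 4):
    d = summary_digits(option)
    print("digits =", option, "->", d, "significant digits:",
          [signif(v, d) for v in console_values])
```

With digits = 4 the values collapse to three significant digits, which matches the numbers in the PDF.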

Related

Matplotlib problems with linewidths

I noticed in doing some line plots that Matplotlib exhibits strange behaviour (using Python 3.7 and the default TKAgg backend). I've created a program plotting lines of various widths to show the problem. The program creates a bunch of financial looking data and then runs through a loop showing line plots of various linewidths. At the beginning of each loop it asks the user to input the linewidth they would like to see. Just enter 0 to end the program.
import numpy as np
import matplotlib as mpl
from matplotlib.lines import Line2D
import matplotlib.pyplot as plt
# Initialize prices and arrays
initial_price = 35.24
quart_hour_prices = np.empty(32) # 32 15 min periods per days
day_prices = np.empty([100,2]) # 100 days [high,low]
quart_hour_prices[0] = initial_price
# Create Data
for day in range(100):
    for t in range(1, 32):
        quart_hour_prices[t] = quart_hour_prices[t-1] + np.random.normal(0, .2) # 0.2 stand dev in 15 min
    day_prices[day,0] = quart_hour_prices.max()
    day_prices[day,1] = quart_hour_prices.min()
    quart_hour_prices[0] = quart_hour_prices[31]
# Setup Plot
fig, ax = plt.subplots()
# Loop through plots of various linewidths
while True:
    lw = float(input("Enter linewidth:")) # input linewidth
    if lw == 0: # enter 0 to exit program
        exit()
    plt.cla() # clear plot before adding new lines
    plt.title("Linewidth is: " + str(round(lw,2)) + " points")
    # loop through data to create lines on plot
    for d in range(100):
        high = day_prices[d,0] # column 0 holds the daily high
        low = day_prices[d,1] # column 1 holds the daily low
        hl_bar = Line2D(xdata=(d, d), ydata=(high, low), color='k', linewidth=lw, antialiased=False)
        ax.add_line(hl_bar)
    ax.autoscale_view()
    plt.show(block=False)
Matplotlib defines linewidths in points (72 points per inch) and uses a default of 100 dpi. So one point of linewidth spans 100/72 ≈ 1.39 pixels, or equivalently one pixel is 0.72 points wide. Thus I would expect linewidths up to 0.72 to be one pixel wide, those from 0.72 to 1.44 to be two pixels wide, and so on. But this is not what I observed.
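The conversion behind that expectation can be written out as plain arithmetic. This is a minimal sketch, assuming 72 points per inch and the 100 dpi figure stated above. The function names are hypothetical, and the code does not model how Matplotlib's renderer actually rasterizes a line.

```python
POINTS_PER_INCH = 72.0


def points_to_pixels(linewidth_pt, dpi=100.0):
    """Nominal width in pixels of a line that is linewidth_pt points wide."""
    return linewidth_pt * dpi / POINTS_PER_INCH


def pixels_to_points(width_px, dpi=100.0):
    """Linewidth in points that nominally covers width_px pixels."""
    return width_px * POINTS_PER_INCH / dpi


for lw in (0.35, 0.72, 1.0, 1.44):
    print(f"{lw:.2f} pt -> {points_to_pixels(lw):.3f} px")
```

On this arithmetic a 0.72 point line is exactly one pixel wide and a 1.44 point line is exactly two.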
A 0.72 linewidth did indeed give me a line that was one pixel wide. And then when the linewidth is increased to 0.73 the line gets thicker as expected. But it is now three pixels wide, instead of the two I expected.
For linewidths less than 0.72 the plot remains the same all the way down to 0.36. But then when I enter a linewidth of 0.35 or less, the line suddenly gets thicker (2 pixels wide), as shown by the graph below. How can the line get thicker if I reduce the linewidth? This was very unexpected.
Continuing the same testing process for greater linewidths, the plot of the 0.73 linewidth remains the same all the way up until a width of 1.07. But then at 1.08 the linewidth mysteriously gets thinner (2 pixels wide) being the same as the 0.35 and below plots. How can the line get thinner if I increase the linewidth? This was also very unexpected.
This strange behavior continues with greater linewidths. Feel free to use the above code to try it for yourself. Here is a table summarizing the results:
Linewidth (points)    Observed width (pixels)
0.01 - 0.35           2
0.36 - 0.72           1
0.73 - 1.07           3
1.08 - 1.44           2
1.45 - 1.79           4
1.80 - 2.16           3
2.17 - 2.51           5
2.52 - 2.88           4
The pattern is something like 1 step back, 2 steps forward. Does anyone know why Matplotlib produces these results?
The practical purpose behind this question is that I am trying to produce an algorithm to vary the linewidth depending upon the density of the data in the plot. But this is very difficult to do when the line thicknesses are jumping around in such a strange fashion.

Combining information of tex and eps file generated via gnuplot to a single figure file?

I use gnuplot with the epslatex terminal to generate figure files (like here). This method produces two files for the same figure: a tex file and an eps file. The graphics are in the eps file and the text and font information is in the tex file. So my question is:
Can I combine both the font information and the figure content into a single file, such as a pdf or eps file?
UPDATE: OK, I forgot to mention one thing. Of course set terminal postscript eps will give me eps output, but it will not embed LaTeX symbols in the plot as labels etc.
So I found a method based on Christoph's comment. Set the terminal with something like set terminal epslatex 8 standalone, and then after plotting do something like the following:
set terminal epslatex color standalone
set output "file.tex"
set xrange [1:500]
set ylabel "Variance (\\AA\\textsuperscript{2})" # angstroms
set mxtics 4
plot "version1.dat" using 1:3 with linespoints pointinterval -5 pt 10 lt 1 lw 3 title 'label1' , \
"version1.dat" using 1:2 with linespoints pointinterval -5 pt 6 lt -1 lw 3 title 'label2';
unset output
# And now the important part (combine info to single file) :
set output # finish the current output file
system('latex file.tex && dvips file.dvi && ps2pdf file.ps')
system('mv file.ps file.eps')
unset terminal
reset
These steps produce a tex file, which is converted to a dvi file and then to a ps file. Finally the PostScript file is renamed to eps. Now the figure and the TeX symbol information are in a single file, and this eps file can be included in LaTeX documents.
As for why this works: sorry, I don't know all the technical details, but it works fine for me.
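For anyone who wants to drive the same conversion from a script rather than from gnuplot's system() call, here is a rough Python sketch. It assumes latex, dvips and ps2pdf are on the PATH. The function names are made up for illustration, and the explicit -o option to dvips is my addition to make the output file name unambiguous.

```python
import os
import subprocess


def build_commands(stem):
    """Return the latex -> dvips -> ps2pdf command sequence for stem.tex."""
    return [
        ["latex", f"{stem}.tex"],
        ["dvips", "-o", f"{stem}.ps", f"{stem}.dvi"],
        ["ps2pdf", f"{stem}.ps"],
    ]


def run_pipeline(stem):
    """Run each step, stopping at the first failure, then rename .ps to .eps."""
    for command in build_commands(stem):
        subprocess.run(command, check=True)
    os.replace(f"{stem}.ps", f"{stem}.eps")


# Show what would be executed for the file.tex written by gnuplot.
for command in build_commands("file"):
    print(" ".join(command))
```

Calling run_pipeline("file") after gnuplot has written file.tex should leave file.pdf and file.eps next to it.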

Reading two column data from a text file and plot using tikz package

I want to export my data in two columns (x and y) and plot them using the tikz package in LaTeX. Is there a way to do it for a large number of rows? A code snippet would be sufficient.
Take a look at the pgfplots package. Here is a simple example:
\documentclass[border=10pt]{standalone}
\usepackage{pgfplots}
\usepackage{filecontents}
% The content of the data file "data.dat".
% Delete this block if you have your data in an external file.
\begin{filecontents}{data.dat}
x y
0 0.2
1 3.3
2 1.5
3 1.1
4 2.5
\end{filecontents}
\begin{document}
\begin{tikzpicture}
\begin{axis}
\addplot table {data.dat};
\end{axis}
\end{tikzpicture}
\end{document}
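If the data comes from a program, the external file only needs a header line followed by whitespace-separated columns, as in the filecontents block above. A minimal Python sketch of the export step follows; the function name write_xy_table is hypothetical, and the numbers are the sample values from the example.

```python
import os
import tempfile


def write_xy_table(path, xs, ys):
    """Write two columns with an 'x y' header, one row per data point."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    with open(path, "w") as handle:
        handle.write("x y\n")
        for x, y in zip(xs, ys):
            handle.write(f"{x} {y}\n")


# Write the sample data to a temporary directory and show the file content.
out_path = os.path.join(tempfile.mkdtemp(), "data.dat")
write_xy_table(out_path, [0, 1, 2, 3, 4], [0.2, 3.3, 1.5, 1.1, 2.5])
with open(out_path) as handle:
    print(handle.read())
```

With the file saved as data.dat next to the LaTeX source, the filecontents block can be deleted and \addplot table {data.dat}; reads the file directly.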

plotting pdf in gnuplot : error Cannot open load file 'stat.inc'

I am learning how to plot a probability density function (pdf) in gnuplot. The code is taken from
http://gnuplot.sourceforge.net/demo/random.html
The code is:
unset contour
unset parametric
load "stat.inc"
print ""
print "Simple Monte Carlo simulation"
print ""
print "The first curve is a histogram where the binned frequency of occurence"
print "of a pseudo random variable distributed according to the normal"
print "(Gaussian) law is scaled such that the histogram converges to the"
print "normal probability density function with increasing number of samples"
print "used in the Monte Carlo simulation. The second curve is the normal"
print "probability density function with unit variance and zero mean."
print ""
nsamp = 5000
binwidth = 20
xlow = -3.0
xhigh = 3.0
scale = (binwidth/(xhigh-xlow))
# Generate N random data points.
set print "random.tmp"
do for [i=1:nsamp] {
    print sprintf("%8.5g %8.5g", invnorm(rand(0)), (1.0*scale/nsamp))
}
unset print
#
set samples 200
tstring(n) = sprintf("Histogram of %d random samples from a univariate\nGaussian PDF with unit variance and zero mean", n)
set title tstring(nsamp)
set key
set grid
set terminal png
set output "x.png"
set xrange [-3:3]
set yrange [0:0.45]
bin(x) = (1.0/scale)*floor(x*scale)
plot "random.tmp" using (bin($1)):2 smooth frequency with steps title "scaled bin frequency", normal(x,0,1) with lines title "Gaussian p.d.f."
I get the error
Cannot open load file 'stat.inc'
"stat.inc", line 4: util.c: No such file or directory
The version of gnuplot I am using is gnuplot 4.6 patchlevel 0.
Please help as I am unable to locate this particular file or run the code on ubuntu platform.
You can find that file in the source release of gnuplot, which you can download from SourceForge.
stat.inc is in the demo directory.

Some inline plots won't print from IPython notebook

I am using Ipython Notebook to generate some bar plots.
The code cell is this:
kcount = 0
for k, v in pledge.groupby(['Date','Break']).sum().Amount.iteritems():
    if k[0] != kcount:
        kcount = k[0]
        pledge[pledge.Date==k[0]].groupby(['Break','Progcode'])['Amount'].sum().plot(kind='bar')
        plt.title(k[0])
        plt.figure()
This gives me a bar plot for every day of our pledge drive, showing how each show within that day did. 24 charts in all. They display great as output on the screen, but when I use the Print button in Ipython Notebook, it only prints enough graphs to fill the last page, which can vary from 3 to 6 graphs depending on the printer used. One printer used reported that it required 11x17 paper for the print job (not something I set anywhere) and when I manually set it for 8 1/2 x 11, it again only printed out the first 3 pages. I am at a loss as to what to do at this point.
As a workaround, can you use plt.savefig('filename.png') (or .jpg, or any other supported format) to save image files and then print the files manually?
I ended up saving these pages to a multipage PDF file and then printing them from there.
See the docs at http://matplotlib.org/api/backend_pdf_api.html for how to save several figures to a multipage PDF file.
This also looks like a good resource: http://blog.marmakoide.org/?p=94
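A minimal sketch of the multipage PDF approach, using PdfPages from matplotlib.backends.backend_pdf. The pledge totals below are invented placeholder numbers and save_bar_charts is a hypothetical helper, not the code I actually ran.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages


def save_bar_charts(charts, pdf_path):
    """Write one bar chart per page; charts maps a title to {label: amount}."""
    with PdfPages(pdf_path) as pdf:
        for title, amounts in charts.items():
            labels = list(amounts.keys())
            positions = list(range(len(labels)))
            fig, ax = plt.subplots()
            ax.bar(positions, list(amounts.values()))
            ax.set_xticks(positions)
            ax.set_xticklabels(labels)
            ax.set_title(title)
            pdf.savefig(fig)  # each savefig call appends a new page
            plt.close(fig)
        return pdf.get_pagecount()


# Placeholder data: two days of a pledge drive, totals per show.
example_charts = {
    "Day 1": {"Show A": 120.0, "Show B": 75.5},
    "Day 2": {"Show A": 90.0, "Show B": 140.0},
}
example_pdf = os.path.join(tempfile.mkdtemp(), "pledge_charts.pdf")
print(save_bar_charts(example_charts, example_pdf), "pages written to", example_pdf)
```

Each day ends up on its own page, so the whole set can be printed from any PDF viewer.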