Struggling with PDF output of bookdown - pdf

I thought it would be a good idea to write a longer report/protocol using bookdown since it's more comfortable to have one file per topic to write in instead of just one RMarkdown document with everything. Now I'm faced with the problem of sharing this document - the HTML looks best (except for wide tables being cut off) but is difficult to send via e-mail to a supervisor for example. I also can't expect anyone to be able to open the ePub format on their computer, so PDF would be the easiest choice. Now my problems:
My chapter headings are pretty long, which doesn't matter in HTML, but they don't fit the page headers in the PDF document. In LaTeX I could define a short title for that. Can I do that in bookdown as well?
I include figure files using knitr::include_graphics() inside of code chunks, so I generate the caption via the chunk options. For some figures, I can't avoid having an underscore in the caption, but that does not work out in LaTeX. Is there a way to escape the underscore that actually works (preferably for HTML and PDF at the same time)? My LaTeX output looks like this after rendering:
\textbackslash{}begin\{figure\}
\includegraphics[width=0.6\linewidth,height=0.6\textheight]{figures/0165_HMMER} \textbackslash{}caption\{Output of HMMER for PA\_0165\}\label{fig:0165}
\textbackslash{}end\{figure\}
Edit
MWE showing that the problem is an underscore in combination with out.height (or width) in percent:
---
title: "MWE FigCap"
author: "LilithElina"
date: "19 Februar 2020"
output: pdf_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## R Markdown
This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see <http://rmarkdown.rstudio.com>.
When you click the **Knit** button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
```{r cars}
summary(cars)
```
## Including Plots
You can also embed plots, for example:
```{r pressure, echo=FALSE, fig.cap="This is a nice figure caption", out.height='40%'}
plot(pressure)
```
Note that the `echo = FALSE` parameter was added to the code chunk to prevent printing of the R code that generated the plot.
```{r pressure2, echo=FALSE, fig.cap="This is a not nice figure_caption", out.height='40%'}
plot(pressure)
```

Concerning shorter headings: pandoc, which is used for the markdown to LaTeX conversion, does not offer a "shorter heading". You can do that yourself, though:
# Really long chapter heading
\markboth{\thechapter~short heading}{}
[...]
## Really long section heading
\markright{\thesection~short heading}
This assumes a document class with chapters and sections.
Concerning the underscore in the figure caption: For me it works for both PDF and HTML to escape the underscore:
```{r pressure2, echo=FALSE, fig.cap="This is a not nice figure\\_caption", out.height='40%'}
plot(pressure)
```

Related

Set itemize spacing in Quarto / RMarkdown PDF

In LaTeX, it's easy to control the vertical spacing in an itemized list by using \setlength\itemsep{1em}:
\begin{itemize}
\setlength\itemsep{1em}
\item one
\item two
\item three
\end{itemize}
How would I control this in Quarto / RMarkdown? If I write an itemized list using just markdown, I can't control the spacing, e.g.:
1. one
2. two
3. three
Is there a way to set a "global" spacing setting for all itemized or bullet lists? That would work.
Otherwise, is there a way to set the vertical spacing for a markdown list?
Edit:
This is the header I'm using for a Quarto PDF document (I didn't realize the solutions might be sensitive to the document class):
---
format:
  pdf:
    documentclass: scrartcl
    papersize: letter
    pdf-engine: xelatex
    geometry:
      - margin=1in
      - heightrounded
    include-in-header:
      - preamble.tex
---
Assuming you are using the Quarto pdf format, you could do something like this:
---
format: pdf
header-includes:
- \apptocmd{\tightlist}{\setlength{\itemsep}{10pt}}{}{}
---
1. one
2. two
3. three

Using rmarkdown to create headers from a large .txt input

I'm trying to create a PDF with headers from rmarkdown. I'm reading in a large text file, and I want to print out this PDF using headers. I can easily read in and print to the PDF with desired formatting, minus the headers, using
```{r comment='', echo=FALSE}
cat(readLines("blah.txt", encoding="UTF-8"), sep="/n")
```
However, I can't get rmarkdown to evaluate '#' in my text, which creates headers. I've inserted the '#' into different sections of the .txt file where I want to create a header, but it doesn't evaluate the hashtag.
Does anyone know how to get rmarkdown to evaluate the '#' as a header without messing up the formatting of the text file as I already have it?
R Markdown recognizes a line starting with # as a header only if it is preceded by an empty line. So all you need is one more line break, \n (note the backslashes):
```{r comment='', echo=FALSE, results='asis'}
cat(readLines("blah.txt", encoding="UTF-8"), sep="\n\n")
```
Note that you may want to add results='asis' to the R chunk options.
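If doubling every line break changes the layout of the text more than you would like, an alternative is to add an empty line only in front of the header lines. The following is a minimal sketch in Python, written as a preprocessing step to run on the text before knitting; the function name `space_out_headers` and the sample lines are made up for illustration, and the header pattern (one to six `#` followed by whitespace) is an assumption about how the headers are written in the .txt file.

```python
import re

# Assumption: a header is 1-6 '#' characters at the start of a line, followed by whitespace.
HEADER = re.compile(r"#{1,6}\s")


def space_out_headers(lines):
    """Return a copy of lines with an empty line inserted before each
    header line that is not already preceded by one."""
    out = []
    for line in lines:
        # Only add a blank line if the previous emitted line has content.
        if HEADER.match(line) and out and out[-1].strip():
            out.append("")
        out.append(line)
    return out


# Hypothetical sample standing in for the contents of the text file.
sample = ["intro text", "# Section one", "body line 1", "body line 2", "", "# Section two", "more"]
print("\n".join(space_out_headers(sample)))
```

All other lines keep their original single line breaks, so only the headers are affected.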

Knitr Spin and Rmarkdown Fig.cap (figure caption). Producing double numbering pdf document

I am referring to the question "Suppress automatic figure numbering in pdf output with r markdown/knitr", which I don't think was answered fully.
Essentially, I am using knitr::spin and rmarkdown to produce word, pdf and html documents.
For word, there appears to be no numbering when one puts in
+fig.1, fig.cap = "Figure name"
You only get an output Figure name in the caption.
To solve that, I used captioner class.
figs = captioner("Figure")
That works fine for word
But I am now faced with rewriting the script for the pdf document, as the caption turns up as figure 1: figure 1: The name
I am using knitr::spin to actually generate the RMD document for forward outputs in word and pdf.
I am not sure I can use hooks in knitr::spin, as I have tried it as advertised but can't get it to work.
I also tried
header-includes: \usepackage{caption} \usepackage{float}
\captionsetup[table]{labelformat=empty}
\captionsetup[figure]{labelformat=empty}
as suggested somewhere to suppress the prefix for pdf but I get errors from pandoc. It uses pdf2latex.
I am not sure how one would query the output format in knitr::spin to actually produce different actions for different formats which could be a solution although cumbersome.
Thank you so much for your help from a novice.

R markdown to PDF preserve (leading) whitespace

I have an R markdown file that I want to convert to PDF using knitr (or sweave).
For example:
---
output: pdf_document
---
```{python}
for x in range(3):
    for y in range(3):
        print(x+y)
```
If I knit to PDF and copy paste the for loop part back to a text editor, the tabs are gone. This is certainly expected behaviour considering how anti whitespace-preservation markdown and pdf are, but can I still somehow preserve the actual whitespace characters when knitting from R markdown to PDF?
This is not possible. Nothing can be done beyond making sure the document looks right, i.e. that the whitespace is visually there.
It comes down to what the PDF format actually is, read the accepted answer to this question - https://superuser.com/questions/198392/how-to-copy-text-out-of-a-pdf-without-losing-formatting

Citation formatting and the hyperref package

I'm using the hyperref package in my document. One of the things that it does is create bookmarks in my pdf, based on the table of contents. Some section titles contain a reference to a citation
\section{Some title \citep{BibTeXkey}}
The label of the bookmark then looks like
Some title BibTeXkey
But I would like it to be
Some title (Author, year)
Just like it is displayed in the text and the table of contents. So only the bookmarks are messed up.
I used the sequence pdflatex, bibtex, pdflatex, pdflatex to compile the document.
How do I change the bookmark label to use the same format as in the table of contents?
Whenever I'm having an issue with the pdf bookmarks not working properly, the solution is usually to use \texorpdfstring. It allows you to make a section title contain some non-text material (like a link or some symbols) and specify what should appear in the pdf bookmark, which cannot contain symbols. The input
\section{The section with \texorpdfstring{LaTeX symbols}{plain text version}}
produces the section title "The section with LaTeX symbols", but the pdf bookmark for the section is "The section with plain text version".
In your case, the easiest thing to do is probably
\section{Some title \texorpdfstring{\citep{BibTeXkey}}{(Author, year)}}
Unfortunately, this means that you have to paste "(Author, year)" in by hand, which is a little annoying, but not a big deal if your bibliography entry doesn't change (which it probably shouldn't) and you don't change your citation conventions.
If you really want to avoid having to type in "(Author, year)" by hand, you can try using the \show command to figure out how \citep produces its output. But I warn you that this approach is not for the faint of heart: in this case, I think you'll end up looking through the aux file, not to mention the blg, brf, and bbl files.