I create PDF documents from Markdown documents using the simplest pandoc command:
pandoc my.md -o my.pdf
The figures inside the PDF are all stretched, i.e. to 100% width.
Which option should I give pandoc so that it leaves the figures at their original size?
Currently you cannot control that feature directly from Markdown.
In recent months there have been some discussions going on in the Pandoc developer + user community about how to best implement it and create an easy-to-use syntax, for example
![Image Caption](./path/to/image.jpg "Image Comment"){width="60%", height="150px"}
(Warning: Example only, made up on the fly and drawn out of thin air by myself -- can't remember the latest state of the discussion...) This is designed to then transfer to all the supported output formats which can contain images, not just PDF.
So this is planned to be a major new feature for the next major release of Pandoc.
As you may or may not know, Pandoc doesn't create the PDFs itself. It produces LaTeX and employs LaTeX technology (by default its pdflatex command) to convert the LaTeX to PDF (then deleting the intermediate LaTeX files).
To exercise some (limited) control over how the LaTeX/PDF pages (or other outputs) look, Pandoc uses template files. You can look at the exact template definitions your own Pandoc version uses for LaTeX/PDF output by running
pandoc -D latex
So if you are a LaTeX hacker (or know one), you are able to modify that or create your own template from scratch.
In the current release of Pandoc (v1.13.2.1), there is this code snippet in the LaTeX template:
\makeatletter
\def\maxwidth{\ifdim\Gin@nat@width>\linewidth\linewidth\else\Gin@nat@width\fi}
\def\maxheight{\ifdim\Gin@nat@height>\textheight\textheight\else\Gin@nat@height\fi}
\makeatother
% Scale images if necessary, so that they will not overflow the page
% margins by default, and it is still possible to overwrite the defaults
% using explicit options in \includegraphics[width, height, ...]{}
\setkeys{Gin}{width=\maxwidth,height=\maxheight,keepaspectratio}
This should keep the original image sizes if they fit into the page width, and scale them down to the page width if they don't.
If this is not the behavior you experience with your PDF output, I suspect you are on a rather old version of Pandoc.
For using your own template instead of the builtin internal one, you can add
--template=/path/to/myown-template.latex
to the Pandoc command line.
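The template workflow described above can be sketched as a small shell function. This is only an illustration: the function name `pdf_with_template` and the file names are placeholders, and only `pandoc -D latex` and `--template` are taken from the answer itself.

```shell
# Sketch: export the built-in LaTeX template once, then build with it.
# pdf_with_template and the file names are hypothetical placeholders.
pdf_with_template() {
    input=$1      # Markdown source, e.g. my.md
    template=$2   # editable copy of the LaTeX template
    output=$3     # resulting PDF

    # Dump the default template only if no copy exists yet, so that
    # manual edits (e.g. to the \setkeys{Gin} line) are preserved.
    if [ ! -f "$template" ]; then
        pandoc -D latex > "$template" || return 1
    fi

    pandoc "$input" --template="$template" -o "$output"
}

# Usage: pdf_with_template my.md myown-template.latex my.pdf
```

After the first run, edit the copied template and call the function again; the existing copy is reused instead of being overwritten.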
@KurtPfeifle Thanks for your help. Using your tip, I updated the LaTeX template to set a static width and height for the images.
In my latex template I have:
\setkeys{Gin}{width=128pt,height=192pt,keepaspectratio}
This works great for the mobile images. But I also have a cover page, where the cover figure is now small sized.
I tried creating 2 different latex files and combining them but the figure sizes are back to being stretched:
pandoc _cover_page.md -o _cover_page.tex
pandoc ... --template=mobile_images.latex -o remaining.tex
pandoc _cover_page.tex remaining.tex -o out.pdf
Is there an easy way to combine LaTeX files in Pandoc so that each one still obeys its own template?
I can create 2 pdf files: cover.pdf and remaining.pdf, and combine them too. Is there an easy tool that you know?
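One possible way to join the two PDFs, sketched under the assumption that pdftk is installed: its `cat` operation concatenates the input files in the order given. The helper name `join_pdfs` is hypothetical; the file names are the ones mentioned in the comment.

```shell
# Sketch: concatenate PDFs with pdftk (assumes pdftk is installed).
# join_pdfs is a hypothetical helper; the first argument is the output file.
join_pdfs() {
    output=$1
    shift
    if [ "$#" -lt 2 ]; then
        echo "usage: join_pdfs out.pdf in1.pdf in2.pdf [...]" >&2
        return 2
    fi
    # Inputs are appended in the order they are listed.
    pdftk "$@" cat output "$output"
}

# Usage: join_pdfs out.pdf cover.pdf remaining.pdf
```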
Problem
My notebook is solely Markdown and I would like to export it to a PDF with the same Markdown rendering that JupyterLab displays. However, the regular PDF export converts it to LaTeX and then to a PDF, and it looks nothing like how I want it formatted. I would rather not have to manually edit a TeX file every time I want to export a notebook to a PDF, especially since it is very time-consuming for large files.
Exporting to WebPDF looks much closer to the result I desire, however, the page size is all over the place and I would like it to be Letter size (8.5 x 11 inches).
Question
How can I control the page size on the WebPDF export?
Bonus Question
Is it possible to get the PDF to look the way it does on JupyterLab Markdown rendering, including the dark theme? (printing the page to PDF does a terrible job and makes all the text an image)
Okay, I am a little confused by the question, but I will do my best to answer this.
First, I would like to introduce you to pandoc. Pandoc is a document conversion system. It lets you control how your Markdown is converted into a PDF or any other format that pandoc supports. For additional formatting control, pandoc supports templates, which allow you to customize exactly how the document is treated on export.
Now to address your page size question. I do not think that you can control this from markdown alone, however you can if you use pandoc. This can be done by adding some LaTeX code into your markdown file. You can find the information on how to control page size using LaTeX here. Once you add this LaTeX code, you can convert to pdf using pandoc and a pandoc template. Pandoc provides a number of default templates which will work fine. Here is an example of the command used to do this conversion:
pandoc /filepath/doc_name.md -o doc_name.pdf --template /file_path/pandoc-templates/default.latex
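As an alternative to embedding LaTeX in the Markdown file, recent pandoc versions let you pass the page size as a template variable. The sketch below assumes the default LaTeX template, which reads a `papersize` variable; `letter_pdf` is a hypothetical helper name, and older pandoc releases may expect a different value.

```shell
# Sketch: request US Letter pages through template variables.
# letter_pdf is a hypothetical helper name.
letter_pdf() {
    input=$1
    output=$2
    # papersize is read by pandoc's default LaTeX template;
    # geometry:margin=1in is handed to the LaTeX geometry package.
    pandoc "$input" -o "$output" -V papersize=letter -V geometry:margin=1in
}

# Usage: letter_pdf doc_name.md doc_name.pdf
```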
Bonus question:
You can make a custom pandoc template to replicate any formatting and rendering that is done in JupyterLab Markdown. I am not too familiar with JupyterLab, but making pandoc templates is not too bad and pandoc provides great documentation available here.
The team I am on has dozens of markdown documents created using a Markdown editor called Typora (they won't want to switch to another editor). We would like to use pandoc to bulk convert the Typora markdown files to PDFs. This would be included as part of a Jenkins build job, so exporting from Typora's GUI to PDF does not work.
Unfortunately, the PDF output has issues. Namely:
Typora uses github flavored markdown which uses pipe tables. Pandoc does not autowrap the table entries causing the text to overflow off the right side of the PDF document.
Code blocks fail to wrap, though I think I can solve this using the listings package.
Here is the pandoc command I am trying to use:
pandoc --standalone --from=gfm+pipe_tables --to=pdf -V geometry:margin=1in --shift-heading-level-by=-1 --resource-path=.:images:jenkins --table-of-contents intputfile.md --output=outputfile.pdf
Based on my research, there doesn't seem to be an easy way to correctly convert Typora's markdown to PDF unless I use a pandoc filter or change pandoc's default latex template. Does that sound right?
Disclaimer: I am new to latex and pandoc, so I hope my question makes sense. I appreciate any help.
When I convert a PDF to an image on the Linux command line, Inkscape seems to give the best result (better quality than gs at the same dpi). Unfortunately, it only converts the first page to PNG. How can I convert every PDF page to a separate PNG file? Do I have to extract each PDF page into a new PDF file, then run the Inkscape conversion on it, and so on?
This isn't solely using Inkscape, but you could use e.g. pdftk to split up the pdf-file into separate pages and convert every page into a png with Inkscape. For example, like this:
pdftk file.pdf burst
for i in pg_*.pdf; do inkscape "$i" -z --export-dpi=300 --export-area-page --export-png="$i.png"; done
Note that pdftk burst creates pdf-files called pg_0001.pdf, etc., so if you have any files named like that, they'll be overwritten. You can remove them afterwards easily using
rm pg_*.pdf
Lu Kas' answer threw warnings for me without doing the conversion, probably because I'm running Inkscape 1.1.
However, I got it running by replacing the deprecated options:
inkscape pdfFile.pdf --export-dpi=300 --export-area-page --export-filename=imageFile.png;
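Combining the pdftk split from the earlier answer with the Inkscape 1.x options above, a per-page conversion might look like the following sketch. It assumes both tools are installed; `pdf_pages_to_png` is a hypothetical helper name, and the `pg_*.pdf` naming is the one pdftk burst is described as using.

```shell
# Sketch: split a PDF with pdftk, then render each page with Inkscape 1.x.
# pdf_pages_to_png is a hypothetical helper name.
pdf_pages_to_png() {
    pdf=$1
    # burst writes pg_0001.pdf, pg_0002.pdf, ... into the current directory.
    pdftk "$pdf" burst || return 1
    for page in pg_*.pdf; do
        # Skip the unexpanded pattern if burst produced no files.
        [ -e "$page" ] || continue
        inkscape "$page" --export-dpi=300 --export-area-page \
            --export-filename="${page%.pdf}.png"
    done
}

# Usage: pdf_pages_to_png file.pdf
```

Each page ends up as pg_0001.png, pg_0002.png, and so on; the intermediate pg_*.pdf files can be removed afterwards.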
For batch processing, rather than slowly looping file by file, Inkscape has a shell mode for scripting with a command file. See https://wiki.inkscape.org/wiki/index.php/Using_the_Command_Line#Shell_mode
However, as with any script file, you need to write a custom text file of commands. Windows users should run it against inkscape.com (which Windows resolves ahead of inkscape.exe), not inkscape.exe.
Since version 1.0 (1.2 at the time of writing), the pages of a multipage PDF can be addressed individually for multiple outputs. For some other examples, see https://inkscape.org/doc/inkscape-man.html#EXAMPLES
Options get replaced over time, so currently, to export PNG, use --export-type="xxx" to batch export a list of input files to type xxx; in this case, --export-type="png".
Also for pdf related inputs and support see https://wiki.inkscape.org/wiki/index.php/Using_the_Command_Line#New_options
For windows users there is a handy batchfile converter here https://gist.github.com/JohannesDeml/779b29128cdd7f216ab5000466404f11
I have approximately 20 Markdown files and I need to convert them into one PDF document. I tried using an online converter, but the images are not shown; they just appear as ![alt text](image.png)
The Calibre app does not show the images either.
By the way, I am using GitBook to generate my Markdown and HTML view. I read the documentation about converting to PDF using gitbook pdf on the command line, but it returns a TypeError.
Does anyone know how to solve this? I am using Windows 10.
Hi, you can use the Pandoc tool (it runs on Windows/macOS/Linux).
It is a command-line tool which can easily convert your Markdown file into PDF (or other formats).
Take a look at the Pandoc website.
You may have to install a LaTeX environment such as MiKTeX in order to convert to PDF.
An example from Pandoc documentation :
From markdown to PDF:
pandoc myInput.md --latex-engine=xelatex -o myOutput.pdf
Furthermore, there are several interesting options, for instance if you want to generate a table of contents in your output.
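For the case in the question (many Markdown files, one PDF), pandoc accepts several input files and concatenates them in the order given. The following is a minimal sketch for a POSIX shell (on Windows, for example Git Bash); `book_pdf` and the file names are hypothetical. Note that pandoc 2.0 renamed `--latex-engine` to `--pdf-engine`, so the sketch uses the newer spelling.

```shell
# Sketch: merge several Markdown files into one PDF with a table of contents.
# book_pdf is a hypothetical helper; the first argument is the output file.
book_pdf() {
    output=$1
    shift
    # pandoc concatenates the input files in the order they are listed.
    # --pdf-engine replaced --latex-engine in pandoc 2.0.
    pandoc "$@" --pdf-engine=xelatex --table-of-contents -o "$output"
}

# Usage (hypothetical file names): book_pdf book.pdf intro.md chapter1.md chapter2.md
```

Relative image paths are looked up from the directory where pandoc runs, so running the command from the folder that contains the Markdown files is the simplest way to have the images found.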
Normal plots generated by R chunks in R Markdown files come out as expected when converted to HTML slides or PDF. However, when they are converted to beamer slides by
pandoc -t beamer ex.md -V theme:Warsaw -o beamer.pdf
the plots become extremely large, especially those generated with par(mfrow=c(n,m)), in which case only a small part of the plot is displayed.
I tried to fix this by setting the chunk option dev='pdf', but it did not help.
The plot in html is
The plot in beamer is
The development version of pandoc includes some code in the beamer template that should scale images to the width of the slide. That may help in your case.
You don't need to install development pandoc to use this, since the change is just to a template. Just generate a copy of the default beamer template using pandoc -D beamer > my.beamer. Insert the following lines into my.beamer after the line \usepackage{graphicx}:
\makeatletter
\def\ScaleIfNeeded{%
\ifdim\Gin@nat@width>\linewidth
\linewidth
\else
\Gin@nat@width
\fi
}
\makeatother
\setkeys{Gin}{width=\ScaleIfNeeded}
Then use pandoc with the option --template=my.beamer.
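Putting the pieces together, the build step could be wrapped as in the sketch below. It refuses to run until the `\ScaleIfNeeded` lines have actually been pasted into the template; `build_slides` is a hypothetical helper name, and the pandoc options are the ones from the question plus `--template`.

```shell
# Sketch: build beamer slides with the patched template.
# build_slides is a hypothetical helper name.
build_slides() {
    input=$1      # e.g. ex.md
    template=$2   # e.g. my.beamer, created with: pandoc -D beamer > my.beamer
    output=$3     # e.g. beamer.pdf

    # Refuse to build if the scaling macro is missing from the template.
    if ! grep -q 'ScaleIfNeeded' "$template" 2>/dev/null; then
        echo "template $template lacks the \\ScaleIfNeeded definition" >&2
        return 1
    fi

    pandoc -t beamer "$input" -V theme:Warsaw --template="$template" -o "$output"
}

# Usage: build_slides ex.md my.beamer beamer.pdf
```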