Export Jupyter Notebook to PDF With Same Markdown Rendering - pdf

Problem
My notebook is solely Markdown and I would like to export it to a PDF with the same Markdown rendering that JupyterLab displays. However, the regular PDF export converts it to LaTeX and then to a PDF, and it looks nothing like how I want it formatted. I would rather not have to manually edit a TeX file every time I want to export a notebook to a PDF, especially since that is very time-consuming for large files.
Exporting to WebPDF looks much closer to the result I want; however, the page size is all over the place and I would like it to be Letter size (8.5 x 11 inches).
Question
How can I control the page size on the WebPDF export?
Bonus Question
Is it possible to get the PDF to look the way it does on JupyterLab Markdown rendering, including the dark theme? (printing the page to PDF does a terrible job and makes all the text an image)

Okay, I am a little confused by the question, but I will do my best to answer this.
First, I would like to introduce you to pandoc. Pandoc is a document conversion system. It lets you control how your Markdown is converted into a PDF or any other format that pandoc supports. For additional formatting control, pandoc supports templates, which allow you to customize exactly how the document is treated on export.
Now to address your page size question. I do not think you can control this from Markdown alone; however, you can if you use pandoc, by adding some LaTeX code to your Markdown file. You can find the information on how to control page size using LaTeX here. Once you add this LaTeX code, you can convert to PDF using pandoc and a pandoc template. Pandoc provides a number of default templates, which will work fine. Here is an example of the command used to do this conversion:
pandoc /filepath/doc_name.md -o doc_name.pdf --template /file_path/pandoc-templates/default.latex
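As an alternative to hand-written LaTeX, pandoc's default LaTeX template reads the papersize and geometry variables, so the page size can also be requested on the command line. A minimal sketch, assuming pandoc and a LaTeX engine are installed; the function name md_to_letter_pdf and the 1in margin are made up for illustration:

```shell
# Sketch: convert one Markdown file to a US Letter PDF with pandoc.
# md_to_letter_pdf is a hypothetical helper name, not part of pandoc.
md_to_letter_pdf() {
    # $1 = input Markdown file; the output name swaps .md for .pdf
    pandoc "$1" -o "${1%.md}.pdf" \
        -V papersize=letter \
        -V geometry:margin=1in
}
```

Calling md_to_letter_pdf notes.md would write notes.pdf next to the input. A custom --template can be added to the same command if the default layout is not enough.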
Bonus question:
You can make a custom pandoc template to replicate any formatting and rendering that is done in JupyterLab Markdown. I am not too familiar with JupyterLab, but making pandoc templates is not too bad, and pandoc provides great documentation, available here.

Related

Recommended Way to Bulk Convert Typora Markdown to PDF using Pandoc

The team I am on has dozens of markdown documents created using a Markdown editor called Typora (they won't want to switch to another editor). We would like to use pandoc to bulk convert the Typora markdown files to PDFs. This would be included as part of a Jenkins build job, so exporting from Typora's GUI to PDF does not work.
Unfortunately, the PDF output has issues. Namely:
Typora uses GitHub Flavored Markdown, which uses pipe tables. Pandoc does not auto-wrap the table entries, causing the text to overflow off the right side of the PDF document.
Code blocks fail to wrap, though I think I can solve this using the listings package.
Here is the pandoc command I am trying to use:
pandoc --standalone --from=gfm+pipe_tables --to=pdf -V geometry:margin=1in --shift-heading-level-by=-1 --resource-path=.:images:jenkins --table-of-contents inputfile.md --output=outputfile.pdf
Based on my research, there doesn't seem to be an easy way to correctly convert Typora's markdown to PDF unless I use a pandoc filter or change pandoc's default latex template. Does that sound right?
Disclaimer: I am new to latex and pandoc, so I hope my question makes sense. I appreciate any help.
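For the bulk part of the question, a loop around pandoc is one way a build job could handle every file in a directory. This is only a sketch, assuming pandoc and a LaTeX engine are available on the build machine; the function name convert_each_md is hypothetical, and the options are a reduced subset of the command above. It does not address the table or code-block wrapping issues.

```shell
# Sketch: convert every .md file in a directory to a PDF beside it.
# convert_each_md is a hypothetical helper name.
convert_each_md() {
    src_dir=${1:-.}
    for md in "$src_dir"/*.md; do
        # If nothing matches, the glob stays literal; skip it.
        [ -e "$md" ] || continue
        pandoc --standalone --from=gfm -V geometry:margin=1in \
            --table-of-contents "$md" --output="${md%.md}.pdf" || return 1
    done
}
```

Stopping at the first failure (return 1) lets a Jenkins step fail visibly instead of silently skipping a broken document.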

inkscape: multiple page pdf to multiple png

When I convert a PDF to an image on the Linux command line, Inkscape seems to give the best result (better quality than gs at the same dpi). Unfortunately, it only converts the first page to PNG. How can I convert every PDF page to a different PNG file? Do I have to extract each PDF page, store it in a new PDF file, then convert it with Inkscape, and so on?
This isn't solely using Inkscape, but you could use e.g. pdftk to split up the pdf-file into separate pages and convert every page into a png with Inkscape. For example, like this:
pdftk file.pdf burst;
for i in pg_*.pdf; do inkscape "$i" -z --export-dpi=300 --export-area-page --export-png="$i.png"; done
Note that pdftk burst creates pdf-files called pg_0001.pdf, etc., so if you have any files named like that, they'll be overwritten. You can remove them afterwards easily using
rm pg_*.pdf
Lu Kas' answer threw warnings for me without doing the conversion, probably because I'm running Inkscape 1.1.
However, I got it running by replacing some deprecated options:
inkscape pdfFile.pdf --export-dpi=300 --export-area-page --export-filename=imageFile.png;
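Putting the two answers together, the pdftk split from the first answer can be combined with the Inkscape 1.x options shown above. A sketch only, assuming pdftk and Inkscape 1.x are installed; the function name pdf_pages_to_png is made up, and the pg_*.pdf names are the ones pdftk burst produces:

```shell
# Sketch: split a PDF into pages with pdftk, then render each page
# to PNG using the Inkscape 1.x option names from the answer above.
# pdf_pages_to_png is a hypothetical helper name.
pdf_pages_to_png() {
    # Writes pg_0001.pdf, pg_0002.pdf, ... into the current directory.
    pdftk "$1" burst || return 1
    for page in pg_*.pdf; do
        [ -e "$page" ] || continue
        inkscape "$page" --export-dpi=300 --export-area-page \
            --export-filename="${page%.pdf}.png"
    done
}
```

Run it in an empty working directory, since pdftk burst overwrites any existing pg_*.pdf files.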
For batch processing, rather than slowly looping file by file, Inkscape has a shell mode that reads commands from a script. See https://wiki.inkscape.org/wiki/index.php/Using_the_Command_Line#Shell_mode
However, as with any script-driven approach, you need to write a custom command file. Windows users should run it against inkscape.com (which ranks ahead of .exe when the extension is omitted), not inkscape.exe.
Since version 1.0 (currently 1.2), the pages of a multipage PDF can be addressed individually to produce multiple outputs. For some other examples see https://inkscape.org/doc/inkscape-man.html#EXAMPLES
Options get replaced over time, so currently, to export PNG, use --export-type="xxx" to batch export a list of input files to type xxx. In this case that is --export-type="png".
Also, for PDF-related inputs and support, see https://wiki.inkscape.org/wiki/index.php/Using_the_Command_Line#New_options
For Windows users there is a handy batch-file converter here: https://gist.github.com/JohannesDeml/779b29128cdd7f216ab5000466404f11

some markdown files into one pdf document

I have approximately 20 Markdown files and I need to convert them into one PDF document. I tried an online converter, but the images do not show; they just appear as ![alt text](image.png).
The Calibre app does not show the images either.
By the way, I am using GitBook to generate my Markdown and HTML view. I read the documentation about converting to PDF with gitbook pdf on the command line, but it returns a TypeError.
Does anyone know how to solve this? I am using Windows 10
Hi, you can use the Pandoc tool (it runs on Windows/macOS/Linux).
It is a command-line tool which can easily convert your Markdown files into PDF (or other formats).
Take a look at the Pandoc website.
You may have to install a LaTeX environment such as MiKTeX in order to convert to PDF.
An example from Pandoc documentation :
From markdown to PDF:
pandoc myInput.md --latex-engine=xelatex -o myOutput.pdf
(In pandoc 2.0 and later, this option is named --pdf-engine.)
Furthermore, there are several interesting options, for instance if you want to generate a table of contents in your output.
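Since the question is about roughly 20 files going into one document, it may help to know that pandoc concatenates all input files given on one command line. A minimal sketch, assuming pandoc 2.0 or later (hence --pdf-engine) and xelatex are installed; the function name merge_md_to_pdf and the chapter file names are made up:

```shell
# Sketch: merge several Markdown files into a single PDF.
# merge_md_to_pdf is a hypothetical helper name.
merge_md_to_pdf() {
    # $1 = output PDF; remaining arguments = Markdown files, in order
    merged_pdf=$1
    shift
    pandoc "$@" --pdf-engine=xelatex --toc -o "$merged_pdf"
}
```

For example, merge_md_to_pdf book.pdf chapter1.md chapter2.md. Running it from the directory that holds the Markdown files is probably the simplest way to keep relative image paths such as image.png resolvable.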

IPython/ Jupyter download as PDF styling

Imagine editing a typical IPython (4.x) notebook, notebook.ipynb, in the Jupyter editor. The code, graphs, and markdown get rendered exactly how you like them when previewed in the browser.
But then you "Download as PDF via LaTeX" and get something slightly different:
A centered title/ date header has been added.
The font is now serif instead of sans serif.
Section headers are numbered.
I'd like to change the default output to be a little more "what you see is what you get". In particular: I don't want a title header; I don't want numbering on my section headers; and I want sans serif font (code blocks look better with sans IMHO). How can I do this using the LaTeX custom template.tplx files and/ or the jupyter_nbconvert_config.py configuration?
I don't mind having to use the jupyter nbconvert command, but my first choice would be a one-click solution from the browser.
Thanks!
You can run the following on your notebook file from the command line (in the same directory):
jupyter nbconvert --to latex notebook.ipynb
This will generate a .tex file, which you can then open with a LaTeX editor such as Texmaker. There you can edit the LaTeX code to conform to any style you want (e.g. changing the font, margins, numbering, etc.). Finally, convert the .tex file to PDF (most LaTeX editors have tools for this).
Of course, this isn't an automated solution, but it allows for detailed changes and customization, so your final pdf comes out exactly as you want.
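If the same edits are needed every time, the manual step could be scripted. The sketch below is one possible approach, not something nbconvert provides: it inserts two preamble lines after \documentclass in the generated .tex file, switching to a sans serif default font and turning off section numbering. The function name patch_notebook_tex is hypothetical, and it assumes \documentclass appears on a single line of the file.

```shell
# Sketch: print a patched copy of an nbconvert-generated .tex file.
# patch_notebook_tex is a hypothetical helper name.
patch_notebook_tex() {
    # $1 = .tex file; the patched document goes to stdout
    awk '
        { print }
        /\\documentclass/ {
            # Sans serif as the default font family
            print "\\renewcommand{\\familydefault}{\\sfdefault}"
            # No numbers on section headers
            print "\\setcounter{secnumdepth}{0}"
        }
    ' "$1"
}
```

A possible workflow would be to run jupyter nbconvert --to latex notebook.ipynb, then patch_notebook_tex notebook.tex > styled.tex, then compile styled.tex with a LaTeX engine such as xelatex. The title header is not handled here.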
What you are looking for is to use a different latex template.
See this post for more details.
Changing style of PDF-Latex output through IPython Notebook conversion
Basically, you will need to edit your tplx files in your /nbconvert/templates/latex directory.
I'm still learning LaTeX, but I did manage to change the default font for my documents to sans serif by adding \renewcommand{\familydefault}{\sfdefault} to my article.tplx file.
Like so:
((* block docclass *))
\renewcommand{\familydefault}{\sfdefault}
\documentclass{article}
((* endblock docclass *))

Pandoc disable figure stretching from Markdown to PDF conversion

I create PDF documents from Markdown documents using the simplest pandoc command:
pandoc my.md -o my.pdf
The figures inside the PDF are all stretched, i.e: 100% width.
Which configuration should I give to pandoc to leave the figures as is without changing figure size.
Currently you cannot control that feature directly from Markdown.
In recent months there have been some discussions going on in the Pandoc developer + user community about how to best implement it and create an easy-to-use syntax, for example
![Image Caption](./path/to/image.jpg "Image Comment"){width="60%", height="150px"}
(Warning: Example only, made up on the fly and drawn out of thin air by myself -- can't remember the latest state of the discussion...) This is designed to then transfer to all the supported output formats which can contain images, not just PDF.
So this is planned to be a major new feature for the next major release of Pandoc.
As you may or may not know, Pandoc doesn't create the PDFs itself. It produces LaTeX and uses a LaTeX engine (by default the pdflatex command) to convert the LaTeX to PDF (then deletes the intermediate LaTeX files).
To exert some (limited) control over how the LaTeX/PDF pages (or other outputs) look, Pandoc uses template files. You can look at the exact template definitions your own Pandoc version uses for LaTeX/PDF output by running
pandoc -D latex
So if you are a LaTeX hacker (or know one), you are able to modify that or create your own template from scratch.
In the current release of Pandoc (v1.13.2.1), there is this code snippet in the LaTeX template:
\makeatletter
\def\maxwidth{\ifdim\Gin@nat@width>\linewidth\linewidth\else\Gin@nat@width\fi}
\def\maxheight{\ifdim\Gin@nat@height>\textheight\textheight\else\Gin@nat@height\fi}
\makeatother
% Scale images if necessary, so that they will not overflow the page
% margins by default, and it is still possible to overwrite the defaults
% using explicit options in \includegraphics[width, height, ...]{}
\setkeys{Gin}{width=\maxwidth,height=\maxheight,keepaspectratio}
This should keep the original image sizes if they fit into the page width, and scale them down to the page width if they don't.
If this is not the behavior you experience with your PDF output, I suspect you are on a rather old version of Pandoc.
For using your own template instead of the builtin internal one, you can add
--template=/path/to/myown-template.latex
to the Pandoc command line.
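The two commands above (pandoc -D latex and --template) fit together roughly as follows. This is a sketch, assuming pandoc and a LaTeX engine are installed; the function name render_with_template and the file names are made up for illustration:

```shell
# Sketch: dump pandoc's built-in LaTeX template once so it can be
# edited, then render with that (possibly edited) copy.
# render_with_template is a hypothetical helper name.
render_with_template() {
    tpl=$1      # template file, e.g. myown-template.latex
    src=$2      # Markdown input
    dest=$3     # PDF output
    # Create the template from pandoc's default only if it is missing,
    # so later manual edits are not overwritten.
    if [ ! -f "$tpl" ]; then
        pandoc -D latex > "$tpl" || return 1
    fi
    pandoc "$src" --template="$tpl" -o "$dest"
}
```

After a first run such as render_with_template myown-template.latex my.md my.pdf, the \setkeys{Gin}{...} line in myown-template.latex can be edited and the same command rerun.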
@KurtPfeifle Thanks for your help. I updated the LaTeX to set a static width and height for the images using the tip.
In my latex template I have:
\setkeys{Gin}{width=128pt,height=192pt,keepaspectratio}
This works great for the mobile images. But I also have a cover page, where the cover figure is now small sized.
I tried creating 2 different latex files and combining them but the figure sizes are back to being stretched:
pandoc _cover_page.md -o _cover_page.tex
pandoc ... --template=mobile_images.latex -o remaining.tex
pandoc _cover_page.tex remaining.tex -o out.pdf
Is there an easy way to combine LaTeX files which obey the templates in Pandoc?
I can also create 2 PDF files, cover.pdf and remaining.pdf, and combine them. Is there an easy tool you know of for that?
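For the last option, one tool that can join PDFs is pdftk, via its cat operation. A minimal sketch, assuming pdftk is installed; the function name join_pdfs is made up, and cover.pdf and remaining.pdf are the files named above:

```shell
# Sketch: concatenate PDFs in the given order with pdftk.
# join_pdfs is a hypothetical helper name.
join_pdfs() {
    # $1 = output PDF; remaining arguments = input PDFs, in order
    joined_pdf=$1
    shift
    pdftk "$@" cat output "$joined_pdf"
}
```

For example, join_pdfs out.pdf cover.pdf remaining.pdf. Because each PDF is rendered separately, each one keeps the image sizing of the template it was built with.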