Create Landscape PDF with Pandoc using wkhtmltopdf - pdf

I am trying to create a PDF with landscape orientation in Pandoc.
I am using wkhtmltopdf as the PDF engine. I chose not to use LaTeX. Here is the command I am using:
pandoc test.md -t html -o test.pdf
But it produces a PDF in portrait orientation. How can I create a PDF in landscape mode?
Things I have tried without success:
pandoc -V geometry:landscape test.md -t html -o test.pdf
pandoc -O landscape test.md -t html -o test.pdf
Please help.
Note: I do not want to use LaTeX as my PDF engine.

Please try this:
pandoc test.md -t html \
--pdf-engine-opt="-O" --pdf-engine-opt="Landscape" \
-o test.pdf
Pandoc doesn't know the -O option; it must be given to wkhtmltopdf. Use --pdf-engine-opt once for each option or argument that you want pandoc to pass to the PDF engine.
The above was tested and succeeds on:
Ubuntu 20.04
pandoc 2.5
wkhtmltopdf 0.12.5
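When several wkhtmltopdf options are needed, repeating --pdf-engine-opt by hand gets tedious. A minimal sketch of a wrapper follows; the function name pandoc_wk is hypothetical, and it assumes the same pandoc invocation as in the answer above. Each extra argument is wrapped in its own --pdf-engine-opt flag.

```shell
# Hypothetical helper: pandoc_wk INPUT OUTPUT [wkhtmltopdf args...]
# Wraps every wkhtmltopdf argument in its own --pdf-engine-opt flag.
pandoc_wk() {
  input=$1
  output=$2
  shift 2
  n=$#
  # Rotate through the remaining arguments, appending the wrapped form
  # of each one and dropping the original.
  while [ "$n" -gt 0 ]; do
    set -- "$@" "--pdf-engine-opt=$1"
    shift
    n=$((n - 1))
  done
  pandoc "$input" -t html "$@" -o "$output"
}
```

Calling pandoc_wk test.md test.pdf -O Landscape runs the same command as shown in the answer.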

Related

How to create a PDF/A from command line with Libre Office Draw in headless mode?

LibreOffice Draw allows you to open a non-PDF/A file and export it as a PDF/A-1b or PDF/A-2b file.
The same is possible from the command line by calling on macOS
/Applications/LibreOffice.app/Contents/MacOS/soffice --headless \
--convert-to pdf:draw_pdf_Export \
--outdir ./pdfout \
./input-non-pdfa.pdf
or on Linux simply
libreoffice --headless \
--convert-to pdf:draw_pdf_Export \
--outdir ./pdfout \
./input-non-pdfa.pdf
On the command line, passing --convert-to pdf:draw_pdf_Export tells LibreOffice to create a PDF and to use LibreOffice Draw to do so.
Is there also a way to tell LibreOffice to produce a PDF/A document in headless mode?
For PDF/A-1 (which presumably means PDF/A-1b):
soffice --headless --convert-to pdf:"writer_pdf_Export:SelectPdfVersion=1" --outdir outdir input.pdf
Change the value from 1 to 2 for PDF/A-2. The relevant LibreOffice source files are Common.xcs, pdfexport.cxx and pdffilter.cxx.
(Maybe outdated) API/Tutorials/PDF export - Apache OpenOffice Wiki
Python Guide - PDF export filter data - The Document Foundation Wiki
DPI setting for the excel->pdf conversion command - Ask LibreOffice
Change default resolution in batch PNG conversion [closed] - Ask LibreOffice
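The SelectPdfVersion switch described above can be wrapped in a small helper that builds the --convert-to argument. This is only a sketch: the function name pdfa_filter is hypothetical, it accepts only the two values mentioned in the answer (1 and 2), and the optional second argument lets you swap in draw_pdf_Export from the question, which is an untested assumption here.

```shell
# Hypothetical helper: pdfa_filter VERSION [FILTER]
# Prints the --convert-to argument for PDF/A-1 (1) or PDF/A-2 (2).
pdfa_filter() {
  version=$1
  filter=${2:-writer_pdf_Export}   # filter name used in the answer above
  case "$version" in
    1|2)
      printf 'pdf:%s:SelectPdfVersion=%s\n' "$filter" "$version"
      ;;
    *)
      echo "unsupported PDF/A version: $version" >&2
      return 1
      ;;
  esac
}
```

It would be used as soffice --headless --convert-to "$(pdfa_filter 2)" --outdir outdir input.pdf.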
Since this is possible with LibreOffice only via the GUI, use gs for a command-line solution.
First convert the PDF to PS:
pdftops input.pdf input.ps
Then convert the PS to PDF/A, the archival format of PDF:
gs -dPDFA -dBATCH -dNOPAUSE -dNOOUTERSAVE -dUseCIEColor -sProcessColorModel=DeviceCMYK -sDEVICE=pdfwrite -sPDFACompatibilityPolicy=1 -sOutputFile=input-A.pdf input.ps

How can I convert pdf to asciidoc using pandoc?

I am trying to convert a PDF book to an AsciiDoc document. I have tried the following command:
pandoc -s s.pdf -t asciidoc -o example28.txt
I got an "Unknown reader" error:
q#q-ABRA-A5-V12-1:~/Downloads$ pandoc -s s.pdf -t asciidoc -o example28.txt
pandoc: Unknown reader: pdf
Pandoc can convert to PDF, but not from PDF.
How can I fix this or is there another way to convert from pdf to asciidoc?
Have you tried pdf2txt?
https://pypi.org/project/pdfminer/
It's one of the tools provided there.

How to command line extract svg from pdf using inkscape?

Is there a command line option for asking inkscape to extract svg from pdf page 3 (for example)?
The command I use now is
$ inkscape -f test.pdf -l test.svg
but I would like also the option to export a specific page from this pdf.
What about extracting the page you need with pdftk (or in fact, any other suitable tool) first:
mypage=$(mktemp -u XXXXXX.pdf)
pdftk test.pdf cat 3 output "$mypage"
inkscape -l test.svg "$mypage"
rm "$mypage"
(It would be nice to be able to pipe the output from pdftk directly to inkscape. Unfortunately, when provided from stdin, data are expected by inkscape to be svg. A named pipe doesn't help either, because inkscape seems to attempt to traverse pdf files more than once.)
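The same two steps can be wrapped in a function that takes the page number as a parameter and removes the temporary file even when one of the tools fails. This is a sketch only: the name pdf_page_to_svg is hypothetical, it uses a temporary directory instead of mktemp -u, and it assumes the same pdftk and inkscape -l syntax as the commands above.

```shell
# Hypothetical helper: pdf_page_to_svg INPUT.pdf PAGE OUTPUT.svg
pdf_page_to_svg() {
  pdf=$1
  page=$2
  svg=$3
  tmpdir=$(mktemp -d) || return 1
  # Extract the single page, then hand it to inkscape.
  pdftk "$pdf" cat "$page" output "$tmpdir/page.pdf" &&
    inkscape -l "$svg" "$tmpdir/page.pdf"
  rc=$?
  rm -rf "$tmpdir"   # clean up whether or not the conversion succeeded
  return "$rc"
}
```

For page 3 of test.pdf this would be called as pdf_page_to_svg test.pdf 3 test.svg.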

Ghostscript loses font while extracting the page from PDF

I split a PDF into pages with the following command line:
for G in $(seq 1 $(pdfinfo 47.pdf | sed -n 's/Pages:[^0-9]*\([0-9]*\).*/\1/p')) ; do
gs \
-dSAFER \
-sDEVICE=pdfwrite \
-dBATCH \
-dNOPAUSE \
-dFirstPage=$G \
-dLastPage=$G \
-o $G.pdf \
47.pdf ;
done
But some pages appear without text (graphics are still present).
So I tried to extract the embedded fonts from the PDF:
gs -q -dNODISPLAY extractFonts.ps -c "(47.pdf) extractFonts quit"
I installed these fonts in the system Fonts folder.
After that, I repeated the splitting and nothing changed.
I have no idea how to make sure that the pages are extracted correctly.
Ghostscript and pdfwrite are not actually intended for the purpose of splitting PDF files up. There are other tools which will probably work better; why not try pdftk?
If you really want to use Ghostscript, then I would advise you to get hold of the latest bleeding-edge code from the Git repository. In that code the pdfwrite device will accept an output file name containing a '%d' and will write one file per page.
Beyond that, it seems most likely to me that you are simply experiencing a bug rather than 'losing the font'. If the font were missing, the text would still be there, but in a different font. Which version of GS are you using?
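With a Ghostscript build that supports '%d' in the pdfwrite output name, as described above, the per-page loop from the question collapses into a single call. A minimal sketch, assuming such a build is installed; the function name split_pdf_pages and the output prefix are hypothetical.

```shell
# Hypothetical helper: split_pdf_pages INPUT.pdf PREFIX
# Writes PREFIX1.pdf, PREFIX2.pdf, ... one file per page.
# Requires a Ghostscript whose pdfwrite device accepts %d in the name.
split_pdf_pages() {
  input=$1
  prefix=$2
  # -o sets the output file and implies -dBATCH -dNOPAUSE.
  gs -dSAFER -sDEVICE=pdfwrite -o "${prefix}%d.pdf" "$input"
}
```

For the file from the question this would be split_pdf_pages 47.pdf page_.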

GhostScript alternative

I'm currently using CentOS 5.6 (Ghostscript 8, ImageMagick 6.2.8)
and I'm trying to convert the first image of the PDF to a JPG file.
I understand that my current setup is unable to convert compressed PDF files, but is there an alternative with the same functionality that I can use?
The 'understanding' that Ghostscript is unable to convert 'compressed PDF' is wrong. Where did you pick it up?
PDF by default uses compression internally for most of its objects. It's rather unusual to find a PDF 'in the wild' which is completely uncompressed.
Which exact version of Ghostscript are you using? (Try gs -v).
BTW, you do not need ImageMagick to convert (multipage) PDF to a series of JPEGs. Try this command:
gs \
-o img_%03d.jpeg \
-sDEVICE=jpeg \
input.pdf
or, for a resolution of 300 dpi (instead of the default 72 dpi):
gs \
-o img_%03d.jpeg \
-sDEVICE=jpeg \
-r300 \
input.pdf
The _%03d-part of the output filename will attach a 3-digit number to the img-name that increments with each PDF page.
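The commands above convert every page. If only the first page is wanted (assuming that is what "first image" in the question means), the -dFirstPage and -dLastPage switches can limit the range. A minimal sketch; the function name pdf_first_page_jpeg and the 300 dpi default are illustrative choices, not part of the original answer.

```shell
# Hypothetical helper: pdf_first_page_jpeg INPUT.pdf OUTPUT.jpeg [DPI]
pdf_first_page_jpeg() {
  input=$1
  output=$2
  dpi=${3:-300}   # resolution in dpi; 300 is an arbitrary default here
  gs -o "$output" -sDEVICE=jpeg -r"$dpi" \
     -dFirstPage=1 -dLastPage=1 "$input"
}
```

For example, pdf_first_page_jpeg input.pdf first.jpeg 150 would render page 1 at 150 dpi.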