I am collecting quite a lot of material in a GitHub wiki. I like using the wiki to cooperate with other people, and IMHO the platform is really nice.
So I would like to keep using the GH wiki to collect, edit, and save material, but I would also like to export the content to create a PDF file that we can call "a manual".
I would like to generate an updated version of the manual automatically whenever I want, just by running a couple of scripts; I cannot put too much effort into this.
I guess it is possible to export the content somehow and then use pandoc (http://johnmacfarlane.net/pandoc/) to create the PDF, maybe adding an index and a style file.
Another interesting idea would be to publish a website once a month, dumping the content directly from the wiki.
I guess other people have already done something like this, but I did not find anything.
Any ideas?
But... the GitHub wiki of a GitHub repo is a Git repo in itself (introduced in August 2010).
You can clone it, push to it or pull from it.
Each wiki is a Git repository, so you're able to push and pull them like anything else.
Each wiki respects the same permissions as the source repository.
Just add ".wiki" to any repository name in the URL, and you're ready to go.
Or, as noted by htafoya in the comments, replace the .git part of the URL (if present) by .wiki.
That makes the "export" part of your question really trivial.
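The URL rule above can be sketched as a small Python helper. This is only an illustration: the function name and the "user/project" repository are hypothetical placeholders, and it assumes the wiki is cloned through the usual ".wiki.git" URL.

```python
def wiki_clone_url(repo_url: str) -> str:
    """Turn a GitHub repository URL into the clone URL of its wiki."""
    base = repo_url.rstrip("/")
    # Replace a trailing ".git" (if present), as suggested in the comments.
    if base.endswith(".git"):
        base = base[: -len(".git")]
    return base + ".wiki.git"


# "user/project" is a placeholder, not a real repository.
print(wiki_clone_url("https://github.com/user/project.git"))
```

The result can be passed straight to git clone.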
From there, you will find tons of scripts for converting Markdown pages into PDF:
a Gradle task
a makefile
a python script
...
I'm adding to this answer, in case it helps any new readers :) here's what I did:
I installed GitHub Desktop: https://desktop.github.com/
Then, on the wiki page in my repository, I clicked "Clone in Desktop"
This saved the wiki locally as a .md file (after following the steps on screen)
I then used http://www.markdowntopdf.com/ to convert it to pdf
(Note: I renamed the files to remove characters that wouldn't work in a pdf file name before uploading to the website)
The end result was really nice.
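If you have many pages, the renaming step could be scripted. Below is a minimal Python sketch; which characters actually cause trouble is not stated above, so the "letters, digits, underscore, hyphen" whitelist is an assumption, and safe_filename is a hypothetical helper name.

```python
import re


def safe_filename(name: str) -> str:
    """Replace runs of unusual characters in the file stem with '_'."""
    stem, dot, ext = name.rpartition(".")
    if not dot:  # no extension at all
        stem, ext = name, ""
    # Assumption: only letters, digits, '_' and '-' are considered safe.
    cleaned = re.sub(r"[^A-Za-z0-9_-]+", "_", stem).strip("_")
    return f"{cleaned}.{ext}" if ext else cleaned


print(safe_filename("FAQ: How do I start?.md"))
```

Combine it with os.rename if you want to rename the files on disk.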
I found many of the solutions difficult to reproduce, get the right version of, understand, or fix. So instead, I'll present a patchwork Docker solution to convert on Windows (using Git Bash), macOS, or Linux in 5 "easy" commands:
git clone {project_url}.wiki .
# Convert *.md to *.md.html using the actual github pipeline
docker run --rm -e DOCKER_USER_ID=`id -u` -e DOCKER_GROUP_ID=`id -g` \
-v "`pwd`:/src" -v "`pwd`:/out" andyneff/github-markdown-preview
# Fix hyperlinks, since wkhtmltopdf is stricter than github servers
docker run --rm -v `pwd`:/src -w /src perl \
perl -p -i -e 's|(.*?)|\1\L\2\E.md.html\L\3\E\4|g' \
*.html
# Lowercase all filenames so that the hyperlinks match
docker run --rm -v `pwd`:/src -w /src python \
python -c 'import sys;import os; [os.rename(f, f.lower()) for f in sys.argv[1:]]' \
*.md.html
# Convert HTML to PDF using Qt WebKit
docker run -it --rm -e DOCKER_USER_ID=`id -u` -e DOCKER_GROUP_ID=`id -g` \
-v `pwd`:/work -w /work andyneff/wkhtmltopdf \
wkhtmltopdf --encoding utf-8 --minimum-font-size 14 \
--footer-left "[date]" --footer-right "[page] / [topage]" \
--footer-font-size 10 \
toc \
*.html document.pdf
The Perl step is the part most likely to fail, and I have no better solution for it. Pandoc has a really good filter solution, but it doesn't use the GitHub pipeline.
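The link-fixing idea can also be sketched in Python. This is not the Perl substitution from the commands above, only an illustration under assumptions: wiki-internal links are taken to look like href="Page-Name" or href="Page-Name#anchor" inside an a tag, and external links (with a scheme), absolute paths, and pure #anchors are left alone. fix_links is a hypothetical name.

```python
import re

# Assumption: internal wiki links are relative hrefs inside <a ...> tags.
_INTERNAL_LINK = re.compile(
    r'(<a\b[^>]*?\bhref=")'                 # opening of the tag up to the URL
    r'(?![A-Za-z][A-Za-z0-9+.-]*:|#|/)'     # skip schemes, anchors, absolute paths
    r'([^"#]+)(#[^"]*)?(")'                 # page name, optional anchor, closing quote
)


def fix_links(html: str) -> str:
    """Point internal links at the lowercased '<page>.md.html' files."""
    def repl(match):
        page = match.group(2).lower()
        anchor = (match.group(3) or "").lower()
        return f"{match.group(1)}{page}.md.html{anchor}{match.group(4)}"

    return _INTERNAL_LINK.sub(repl, html)


print(fix_links('<a href="Setup-Guide#Install">setup</a>'))
```

A regular expression is still a fragile way to edit HTML, so check the result on a few pages before converting everything.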
Bugs:
Extra-wide code blocks are rendered with a scroll bar and are essentially cut off in the PDF. It would be best to keep the code block from overflowing, but as a workaround you can add --user-style-sheet user.css to the wkhtmltopdf command (before toc/cover) and put the following in your user.css:
.markdown-body .highlight pre,
.markdown-body pre{
overflow:visible !important;
}
Some links in the final PDF are off by +1 page, some are not. I'm not sure what the pattern is, but anchors with ids (#) do not appear to have this problem.
Another option once you have cloned the wiki, especially if you are already using Atom, is to use this Markdown to PDF package.
Worked great for me.
I found it really annoying to have to convert each Markdown document separately (links between Markdown documents are lost), so I ended up writing a simple C# program for my own use that does this in a single step: a) download the latest version of the wiki from GitHub, b) convert all the Markdown documents, merged, into one PDF.
You can download the binaries (Windows or any platform supporting Mono) from:
https://github.com/borjafdezgauna/CoderDocTools/releases/latest
If, for example, you want to convert to PDF the SimionZoo repository by user simionsoft, you can:
MarkdownToPDF.exe user=simionsoft project=SimionZoo output-file=SimionZoo.pdf
I've accomplished precisely this when creating the portable documentation for Barcode Writer in Pure PostScript:
GitHub Wiki + Makefile + pandoc → PDF
The process is described in this blog post.
This question has already been answered but wanted to add my quick experience here.
I didn't find it necessary to install the Desktop version of Github. You can clone by simply running the following from your commandline:
git clone git@github.com:<username>/<repository>.wiki.git
(Of course, replace username and repository as needed).
The cloned wiki contained 72 Markdown files. As has been said previously, there are numerous ways of converting these files to PDF, and you can pick your own tool. However, I will say that the easiest solution I encountered was to install Pandoc. I have macOS + Homebrew, so a quick brew install pandoc was all I needed.
Some info on using pandoc here: https://stackoverflow.com/a/14908316/3638172
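To turn a whole cloned wiki into one PDF, the pandoc call could be assembled like this. It is a minimal sketch with assumptions: putting Home.md first and skipping files that start with "_" (such as _Sidebar.md) is my own choice, "manual.pdf" is a placeholder name, and build_pandoc_command is a hypothetical helper. It only builds the argument list; pandoc and a LaTeX engine still need to be installed to run it.

```python
from pathlib import Path
from typing import List


def build_pandoc_command(wiki_dir: str, output: str = "manual.pdf") -> List[str]:
    """Build a pandoc command that merges all wiki pages into one PDF."""
    pages = [p for p in Path(wiki_dir).glob("*.md") if not p.name.startswith("_")]
    if not pages:
        raise FileNotFoundError(f"no Markdown files found in {wiki_dir}")
    # Assumption: Home.md goes first, the rest in case-insensitive name order.
    pages.sort(key=lambda p: (p.name != "Home.md", p.name.lower()))
    # --toc adds a table of contents; -o selects the output file.
    return ["pandoc", "--toc", "-o", output] + [str(p) for p in pages]
```

Run the result with subprocess.run(build_pandoc_command("path/to/wiki"), check=True).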
You can also try html_links_to_pdf!
It's a Python 3 script made just to convert a GitHub Wiki to pdf form, using the same styling that GitHub uses, but slightly cleaner.
I need to convert 800+ PDF files into HTML web pages, with every PDF file getting its own HTML page.
I tried to do it with Adobe Acrobat, but what I got was every PDF merged into one big list.
So is there any way to do this automatically?
You could use pdftohtml on Linux and make it loop through all the files in the directory.
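A loop like that could be sketched in Python as follows. It assumes the pdftohtml tool from poppler-utils is on the PATH; the directory names are placeholders and both function names are hypothetical. The -noframes option asks for a single HTML file per PDF instead of a frameset.

```python
import subprocess
from pathlib import Path
from typing import List


def pdftohtml_commands(pdf_dir: str, out_dir: str) -> List[List[str]]:
    """Build one pdftohtml command per PDF file found in pdf_dir."""
    out = Path(out_dir)
    return [
        ["pdftohtml", "-noframes", str(pdf), str(out / pdf.stem)]
        for pdf in sorted(Path(pdf_dir).glob("*.pdf"))
    ]


def convert_all(pdf_dir: str, out_dir: str) -> None:
    """Run the commands one after another, stopping at the first failure."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for cmd in pdftohtml_commands(pdf_dir, out_dir):
        subprocess.run(cmd, check=True)
```

Calling convert_all("pdfs", "html") should leave one HTML page per PDF in the html directory.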
You can also find more information about pdftohtml on this thread: How to convert PDF to HTML?
pdf2htmlEX
Preserves formatting of the PDF file
Only works through Docker (on new Linux releases this package is not available and the deb packages do not install)
sudo docker pull bwits/pdf2htmlex
sudo docker run -ti --rm -v /home/user/Documents/pdfToHtml:/pdf bwits/pdf2htmlex pdf2htmlEX --zoom 1.3 file.pdf
I am trying to download a bunch of PDFs from the Federal Reserve archives, but I have to click on a link and then view the PDF before I can download it. Is there a way to automate this?
Example: https://fraser.stlouisfed.org/title/5170#521653 is a link to speeches and then you have to click the title, then view pdf, then the actual download button.
All of the remote .pdf files follow the path format:
https://fraser.stlouisfed.org/files/docs/historical/frbatl/speeches/guynn_xxxxxxxx.pdf
where each x is a placeholder for a digit.
So, yes, it's very easy to download a bunch of these PDFs in one go using the command-line in Terminal or whatever shell program you have access to.
If you're in a *nix-based operating system (including MacOS), that's good because your shell probably already has a command utility called curl installed. Windows may have it too, I'm not sure; I don't use Windows.
If you're using Windows, you'll have to make some tweaks to the code below, because the folder structures and file naming conventions are different, so the first couple of commands won't work.
But, if you're happy to proceed, open up a Terminal window, and type in this command to create a new directory in your Downloads folder, into which the .pdf files will be downloaded:
mkdir ~/Downloads/FRASER_PDFs; cd ~/Downloads/FRASER_PDFs
Hit Enter. Next, if there's no error, copy and paste this long command and then hit Enter:
curl --url \
"https://fraser.stlouisfed.org/files/docs/historical/frbatl/speeches/guynn_{"$(curl \
https://fraser.stlouisfed.org/title/5170#521653 --silent \
| grep -Eio -e '/files/docs/historical/frbatl/speeches/guynn_[0-9]+\.pdf' \
| grep -Eo -e '[0-9]+' | tr '\n' ',')"}.pdf" -O --remote-name-all
You can see this uses the URL you supplied in your question, from which that command retrieves all the .pdf links. If you need to do the same with other similar pages, provided they all use the same URL format, you can just substitute 5170#521653 with whatever page reference contains another list of .pdfs.
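The extraction step of that command can also be sketched in Python, which may be easier to adapt on Windows. It only covers turning the page HTML into a list of PDF URLs; fetching the page and saving the files is left out. The path pattern is the one described above, and pdf_urls is a hypothetical helper name.

```python
import re
from typing import List

BASE = "https://fraser.stlouisfed.org"
# Same path format as described above: guynn_ followed by digits.
PDF_PATH = re.compile(
    r"/files/docs/historical/frbatl/speeches/guynn_\d+\.pdf", re.IGNORECASE
)


def pdf_urls(page_html: str) -> List[str]:
    """Return the unique PDF URLs found in the page, in order of appearance."""
    urls: List[str] = []
    for path in PDF_PATH.findall(page_html):
        url = BASE + path
        if url not in urls:  # the same file may be linked more than once
            urls.append(url)
    return urls
```

Each returned URL can then be downloaded with, for example, urllib.request.urlretrieve.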
I'm writing company internal documentation in R markdown and compiling using knitr in Rstudio. I'm trying to add a link pointing to a directory as follows:
[testdir](file:///c:/test/)
(this is following the convention described in here)
When I compile it to html, I get the following link.
<a href="file:///c:/test/">testdir</a>
and it works as expected in Internet explorer. However, when I try to convert to pdf straight from RStudio, an unwanted pdf extension is appended to the link. I tried dissecting the problem and it seems this change is happening within pandoc. Here are the details.
When I convert it to latex using pandoc,
>pandoc -f markdown -t latex testing.md -o test.tex
the link in the latex output file looks as follows:
\href{file:///c:/test/}{testdir}
Everything good so far. However, when I convert the latex output to pdf with pandoc,
>pandoc -f latex -t latex -o test.pdf test.tex
a .pdf extension is appended to the link. Here is a copy/paste of the pdf link output:
/c:/test/.pdf
Is there a way to avoid this unwanted appended extension?
Perhaps I'm asking too much of pandoc, but I thought it might be worth asking since RStudio is becoming such a useful IDE to write my dynamic documents.
As you said, the .tex file pandoc generates is fine. So the problem is actually with LaTeX, specifically with the hyperref package which is used in pandoc's LaTeX template.
The problem, along with two possible solutions, was described here. To prevent hyperref from being smart and adding a file extension, try:
[testdir](file:///c:/test/.)
Or use ConTeXt instead of LaTeX:
$ pandoc -t context -s testing.md -o test.tex && context test.tex
I have a Markdown file that I wish to convert to PDF so that I can upload it on Speakerdeck. I am using Pandoc to convert from markdown to PDF.
My problem is I can't specify what content should go on what page of the PDF, because Markdown doesn't provide any feature like that.
E.g., Markdown:
###Hello
* abc
* def
###Bye
* ghi
* jkl
Now I want Hello to be one slide and Bye to be on another slide on Speakerdeck. So, I will need them to be on different pages in the PDF that I generate using Pandoc.
But both Hello and Bye end up on the same page in the PDF.
How can I accomplish this?
Via the terminal (tested in 2020)
Download dependencies
sudo apt-get install pandoc texlive-latex-base texlive-fonts-recommended texlive-extra-utils texlive-latex-extra
Try to use
pandoc MANUAL.txt -o example13.pdf
pandoc MANUAL.md -o example13.pdf
Via a Visual Studio Code extension (tested in 2020)
Download the Yzane Markdown PDF extension
Right-click inside a Markdown file (md)
In the context menu that appears, select the Markdown PDF: Export (pdf) option
Note: Emojis are better in Windows than Linux (I don't know why)
2016 update:
NPM module: https://github.com/alanshaw/markdown-pdf
Has a command line interface: https://github.com/alanshaw/markdown-pdf#usage
npm install -g markdown-pdf
markdown-pdf <markdown-file-path>
Or, an online service: http://markdown2pdf.com
As SpeakerDeck only accepts PDF files, the easiest option is to use the LaTeX Beamer backend for pandoc:
pandoc -t beamer -o output.pdf yourInput.mkd
Note that you should have LaTeX Beamer installed for that.
In Ubuntu, you can do sudo apt-get install texlive-latex-recommended to install it. If you use Windows, you may try this answer.
You may also want to try the HTML/CSS output from Slidy:
pandoc --self-contained -t slidy -o output-slidy.html yourInput.mkd
It has decent print output, as you can check by trying to print the original.
Read more about slideshows with pandoc here.
Easy online solution: dillinger.io.
Just paste your Markdown content into the editor on the left and see the (HTML) preview on the right. Then click Export as at the top and choose PDF.
It's based on the open source dillinger editor.
Adding to elias' answer, if you want to separate text into slides, just put *** between the pieces of text you want to separate. For your example to span several pages, write it like this:
### Hello
- abc
- def
***
### Bye
- ghi
- jkl
And then use elias' answer, pandoc -t beamer -o output.pdf yourInput.md.
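For a long file, the separators could be inserted automatically. Here is a minimal Python sketch of that idea, assuming every line that starts with "#" is a heading that should begin a new slide (so "#" lines inside code blocks would be misread); split_slides is a hypothetical helper name.

```python
def split_slides(markdown: str, rule: str = "***") -> str:
    """Insert a horizontal rule before every heading except the first."""
    out = []
    seen_heading = False
    for line in markdown.splitlines():
        # Assumption: any line starting with '#' is a heading.
        if line.startswith("#"):
            if seen_heading:
                # Blank lines around the rule keep it a separate block.
                out.extend(["", rule, ""])
            seen_heading = True
        out.append(line)
    return "\n".join(out) + "\n"


print(split_slides("### Hello\n- abc\n- def\n### Bye\n- ghi\n- jkl\n"))
```

Save the result to a file and pass it to the pandoc -t beamer command as before.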
I have Ubuntu 18.10 (Cosmic Cuttlefish) and installed the full package from texlive. It works for me.
Previously I had used the npm markdown-pdf answer. However, on a fresh install of Ubuntu 19.04 (Disco Dingo) I had issues getting it to install correctly.
Instead I started using the Visual Studio Code package: "Markdown PDF"
Details:
Name: Markdown PDF
Id: yzane.markdown-pdf
Description: Convert Markdown to PDF
Version: 1.2.0
Publisher: yzane
Visual Studio Marketplace link: https://marketplace.visualstudio.com/items?itemName=yzane.markdown-pdf
It has worked consistently well. If you've had issues getting other answers to work, I would recommend trying this.
I've managed to get a stable Markdown -> HTML -> PDF pipeline working with the MarkReport project.
It does a bit more than Pandoc, though, since it is based on WeasyPrint and is therefore aimed at clean report publishing, with cover, headers, sections, ...
It also enriches the HTML with syntax highlighting and LaTeX equations.
Simple way with iOS:
Use the Shortcuts app (by Apple) with these actions, each one taking the output of the previous one:
Make Rich Text From Markdown: Clipboard
Make PDF from Rich Text From Markdown
Show [PDF] in Quick Look
Just copy text and run the shortcut. Press share in Quick Look (bottom left) to store or send it. I use this to quickly convert Joplin notes to pdf.
I found that many Markdown-to-PDF converters produce files that don't look particularly neat. However, there is a solution to this.
If you're using IntelliJ, you can use a plugin called "Markdown". The export function uses pandoc as its engine, so you will probably need to install that along with pdflatex. https://pandoc.org/installing.html
In IntelliJ, under Tools > Markdown Converter > Export Markdown File To...
And there you go, a clean looking document. Additional styling can be added via a .css stylesheet.
Currently TextMate uses Safari's WebKit to render the HTML output of both commands and the live web preview window.
Unfortunately, for one specific project I am working with a JavaScript API that is supported only by Firefox's Gecko or Chrome's WebKit; it seems Safari still does not support it.
Perhaps there's a way to globally swap Safari for Chromium or WebKit Nightly?
A support member of TextMate kindly answered my email asking for it by mentioning this url which definitely points to the right solution. I really didn't think it could be done so seamlessly and now I am very happy that it is possible.
Basically there are few steps to follow:
$ cd /Applications/TextMate.app/Contents/MacOS/
$ mv TextMate _TextMate
$ vim TextMate
new TextMate file contains (note that you might want to change the path for the new webkit framework to fit the one you like)
#!/bin/bash
env DYLD_FRAMEWORK_PATH=/Applications/WebKit.app/Contents/Frameworks/10.6/ WEBKIT_UNSET_DYLD_FRAMEWORK_PATH=YES /Applications/TextMate.app/Contents/MacOS/_TextMate
after saving the newly created file:
$ chmod a+x TextMate
Close/Run TextMate :)
Or, obviously, if you only want this for a single session, you can simply run the command straight from the terminal like this:
$ env DYLD_FRAMEWORK_PATH=/Applications/WebKit.app/Contents/Frameworks/10.6/ WEBKIT_UNSET_DYLD_FRAMEWORK_PATH=YES /Applications/TextMate.app/Contents/MacOS/TextMate
This is really cool... One thing I've noticed since then is that my themes are no longer being displayed. I have no clue why, but I will try to find the cause.
Try this: http://wiki.macromates.com/Main/Howtos#SafariPreview