I have searched for, but have not found, any discussion of creating HTML tables in Pandoc markdown that show gridlines between the cells.
Could this have been discussed using other keywords? I would think someone would have asked for such a thing.
I am using pipe tables to create a pen-and-paper form to be filled out when provisioning laptops for new users. I typically have to build 2-5 computers at the same time. Having lines between the columns and rows would be better than drawing them with a pen and straight edge.
You could use Pandoc's options to include a CSS file that adds the gridline style to your tables.
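A minimal sketch of that idea in Python, assuming Pandoc is installed and that the form lives in a file called form.md (the file names and the border styling here are placeholders of mine, not anything from the question). It only writes a small stylesheet and builds the Pandoc command; the --standalone and --css options tell Pandoc to emit a full HTML page that links to the stylesheet.

```python
from pathlib import Path

# Illustrative stylesheet: draws a 1px line around every table cell.
TABLE_CSS = """\
table { border-collapse: collapse; }
th, td { border: 1px solid black; padding: 0.3em 0.6em; }
"""


def write_css(path="tables.css"):
    """Write the stylesheet next to the markdown source and return its path."""
    Path(path).write_text(TABLE_CSS, encoding="utf-8")
    return path


def pandoc_command(source="form.md", output="form.html", css="tables.css"):
    """Build the argument list for a standalone HTML page that links the CSS."""
    return ["pandoc", "--standalone", "--css", css, "-o", output, source]
```

To actually run the conversion you would call write_css() and then pass pandoc_command() to subprocess.run(..., check=True). The stylesheet has to stay next to the HTML file, since the page only links to it.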
Related
I am making some Google Docs containing tables. For each table, I turn off the table property "Allow row to overflow across pages". It works fine in Google Docs.
However, when I download the Google Doc to PDF, some of the rows are split across pages. This can happen even if a table has only one row.
I have looked here and couldn't find this question, although there is a related one about HTML to PDF: "Prevent table cells from breaking across page when converting html to pdf".
I can't be the only one to have encountered this. Any answers please?
Thanks
This looks very much like a bug in Google's PDF exporter; I was able to reproduce it. One workaround would be to use a different PDF exporter. Download your Google Doc as an ODT, open the ODT in Word or LibreOffice, and then save it as a PDF. This worked for me: the PDFs saved by both Word and LibreOffice Writer had the tall row starting on a new page, as intended.
I am looking to programmatically edit the tags in a PDF document. In particular, I would like to be able to copy tags from one document to another and edit them as I copy them over.
I have looked at Coherent PDF, Python's pdfrw and Python's pdfedit and have not been successful. I am creating the PDFs in LaTeX, so any LaTeX-based solution would be amazing, but I have not come up with anything that allows me to create tags.
Any advice?
I am trying to laser cut multiple signs, and I need to combine PDFs into a single PDF for import into AutoCAD. The signs are all the same shape, but I need to populate the different text/image for each in the frame.
I have experience with python, and I am open to learning a new tool/software to get this done in the easiest manner possible. I would love any guidance or advice on this project.
Here is a very basic picture of how I would like the final PDF to be.
The PDF toolkit (pdftk) can merge PDFs and change pagination and/or put multiple pages onto a single page.
https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/
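Since you mentioned Python, here is a rough sketch of driving pdftk from a script to concatenate the per-sign PDFs into one file. It assumes pdftk is installed and on your PATH; the function names and file names are made up for illustration. It uses pdftk's cat operation, which joins the input files in the order given.

```python
import shutil
import subprocess


def build_merge_command(inputs, output):
    """Return the pdftk argument list that concatenates `inputs` into `output`."""
    if not inputs:
        raise ValueError("need at least one input PDF")
    return ["pdftk", *[str(p) for p in inputs], "cat", "output", str(output)]


def merge_pdfs(inputs, output):
    """Run pdftk; raises if pdftk is missing or exits with an error."""
    if shutil.which("pdftk") is None:
        raise RuntimeError("pdftk was not found on PATH")
    subprocess.run(build_merge_command(inputs, output), check=True)
```

For example, merge_pdfs(["sign1.pdf", "sign2.pdf"], "all_signs.pdf") would produce one PDF with one sign per page.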
I'm trying to go from a paper document to a searchable pdf with a table of contents.
Sometimes you will download a PDF book or document (for example, the Intel Manual, which can be seen below). This document is searchable and it also has a table of contents. Now, when you put this same document on Google Drive and then open it with PDF Expert on an iPad, it is still searchable and still has a table of contents. This is what I'd like to do with all my scanned PDFs.
Now a more concrete example. Shown below is a document that I've scanned with the Fujitsu ScanSnap. It's also searchable thanks to some software that comes with the ScanSnap. So now I have a searchable PDF that can be opened locally or on my iPad, but it doesn't have a table of contents. So my main question is: how can I add a table of contents like the one in the Intel Manual to a scanned PDF?
It seems like there are tons of people doing different things with "table of contents". For example, people who are designing documents use InDesign. I think that what I'm trying to do must be simpler than that. I'm thinking that there has to be an easy way to do this using, say, Adobe Acrobat Pro? Something about adding "bookmarks" or "links" or "tags" to the existing table of contents. Do you know of a clear and concise way to do this using Acrobat or some other software?
thanks for the help
JPdfBookmarks can work for scanned books.
Step 1: Prepare the table of content
Save the TOC in a .txt file in this format:
Chapter 1. The Beginning/23
	Para 1.1 Child of The Beginning/25,FitWidth,96
		Para 1.1.1 Child of Child of The Beginning/26,FitHeight,43
Chapter 2. The Continue/30,TopLeft,120,42
	Para 2.1 Child of The Beginning/32,FitPage
You can OCR the TOC and use regex to fix it.
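As a rough illustration of the regex cleanup, here is a small Python sketch that turns an OCR'd TOC line such as "Chapter 1. The Beginning ..... 23" into the title/page form shown above. The function name, the level and offset parameters, and the pattern itself are my own assumptions; it also assumes that nesting is expressed by leading tab characters, and that every useful line ends in a page number.

```python
import re

# Title, then a run of spaces or dot leaders, then the page number at line end.
TOC_LINE = re.compile(r"^(?P<title>.+?)[\s.]+(?P<page>\d+)\s*$")


def to_bookmark_line(raw, level=0, offset=0):
    """Convert one OCR'd TOC line to 'title/page', or return None if it has no page.

    level  -- nesting depth, written as leading tabs (assumed convention)
    offset -- added to the printed page number to get the PDF page number
    """
    match = TOC_LINE.match(raw.strip())
    if match is None:
        return None
    page = int(match.group("page")) + offset
    return "\t" * level + f"{match.group('title')}/{page}"
```

Lines that come back as None are the ones to fix by hand. If you pass the offset here, skip the page offset step in the GUI, otherwise it gets applied twice.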
Step 2: Load that TOC
Step 3: Prepare for step 4
This sounds dumb, but if you miss it you will be frustrated and have to do it again. Expand all bookmarks (Ctrl + E), select all of them, then go to Tools → Apply Page Offset
Step 4: Apply page offset
This step should be self-explanatory. Don’t forget to save.
That’s it. You are done. For more information, you can read its manual. The program has a command-line mode and also works on Linux and Mac.
If there are non-Roman characters, be sure to use the same encoding when dumping and applying bookmarks.
I also have a complete guide to process scanned books, you may want to check it out: The ultimate guide to process scanned books.
FYI:
• How to OCR tables of contents to proper outputs?
• How can I split in half a double-page scanned PDF in a single pass?
I have done this before by combining multiple "booklets". Each "Chapter" was a series of pages combined in Adobe Acrobat Pro. I would combine chapters into separate "booklets" and then name them a chapter name, and then combine all chapters into a new booklet.
I have a collection of PDFs that sometimes have an info page as the first page of the document, which I want to remove.
Is there a quick way to delete this info page from all of my PDFs, or at least a way to list all PDFs that have more than one page so I can more easily find the ones that need to be fixed?
Do you know of any program that can do this? Or a way to do this with Python?
Note: The info page has text on it that always remains the same: "LAND TITLE OFFICE"
Using Windows 7 OS
Thanks
Some research turned up the following:
http://www.python.org/workshops/2002-02/papers/17/index.htm
http://www.unixuser.org/~euske/python/pdfminer/index.html
https://pypi.org/project/pypdf/
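A minimal sketch of how this might look with pypdf (the last link), assuming the info page has a real text layer so that extract_text() returns the "LAND TITLE OFFICE" string. The function names are mine, and I have not run this against your files. The decision logic is kept separate from the PDF handling so it can be checked without any PDFs.

```python
MARKER = "LAND TITLE OFFICE"


def should_drop_first_page(first_page_text, page_count, marker=MARKER):
    """True when the first page is the info page and other pages remain."""
    return page_count > 1 and marker in (first_page_text or "")


def strip_info_page(src, dst, marker=MARKER):
    """Copy src to dst, leaving out the first page if it contains the marker.

    Returns True if a page was removed. Requires the third-party pypdf package.
    """
    from pypdf import PdfReader, PdfWriter  # imported lazily on purpose

    reader = PdfReader(str(src))
    pages = list(reader.pages)
    if not pages:
        return False
    drop = should_drop_first_page(pages[0].extract_text(), len(pages), marker)
    writer = PdfWriter()
    for page in (pages[1:] if drop else pages):
        writer.add_page(page)
    with open(dst, "wb") as handle:
        writer.write(handle)
    return drop
```

You could loop over a folder with pathlib's glob("*.pdf") and write the results to a separate output folder, so the originals stay untouched. If the info page is a scanned image without text, the marker will not be found and the file is copied unchanged.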
You can try these two ways:
PdfTK is a utility to manipulate PDFs. Check this link; they are doing something similar to what you need (in the comments someone also posted a script for Windows).
PDFsam is a powerful graphical tool to manipulate PDFs in bulk. The split and merge sections should do the trick.
Both of them are free. I'd suggest studying the first if you want to write a "recipe" that you can use often, and the latter if you only have to do it once.
You can use the open-source PDFBox as a command-line utility to split PDFs.
The link for PDFBox is here: link
The documentation for splitting a PDF using PDFBox is here: link
You could use the PDFBox extract text functionality from a batch script and combine with grep to identify pages that contain the text you are looking for. The extract text documentation is here: link