PDFBox - text needs to span the total width of the page (no margins) - pdf

I was using the built-in TextToPDF program from the PDFBox app jar to convert a text file to PDF.
The text in the file is formatted to use the total page width. There are no margins
in the text file, but the PDF is created with margins. How can I create the PDF and keep
the text flush against the side of the page? I use this PDF as an overlay to put the text on top of a form background for invoices that I send by email.
java -jar pdfbox-app-2.0.19.jar TextToPDF -standardFont Courier
I expected the text to start at the left edge of the paper. I expected no margins at all. Full edge to edge. Like printing to a dot matrix printer. That is the input to the pdf.
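For sizing the input, it can help to work out how many Courier characters fit across the page. The following is a minimal sketch in Python (not PDFBox code). It assumes a US Letter page (612 pt wide) and a 10 pt font; both are assumptions, as is the 40 pt margin in the second call, which is only an illustrative value. The one fixed fact it relies on is that every glyph in the standard Courier font advances 600/1000 of the font size.

```python
# Courier is monospaced: each glyph advances 600 units on a 1000-unit em.
COURIER_ADVANCE_UNITS = 600
UNITS_PER_EM = 1000


def columns_per_line(page_width_pt, font_size_pt, margin_pt=0):
    """Return how many Courier characters fit between the left and right margins."""
    usable_width_pt = page_width_pt - 2 * margin_pt
    # Work in glyph units so whole-number inputs avoid rounding surprises.
    return int(usable_width_pt * UNITS_PER_EM // (COURIER_ADVANCE_UNITS * font_size_pt))


if __name__ == "__main__":
    # Assumed page: US Letter, 612 pt wide, 10 pt Courier.
    print(columns_per_line(612, 10))      # edge to edge
    print(columns_per_line(612, 10, 40))  # with a hypothetical 40 pt margin per side
```

Under these assumptions a line holds 102 characters edge to edge and 88 with the hypothetical margin, so any input line wider than that cannot fit at this font size regardless of how the margins are set.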

Related

creating pdftk watermark file from command line

I need pdftk to watermark a PDF. I'm generating the content of the watermark programmatically and writing it out to a text file. Then I use cupsfilter to create the watermark PDF, and then pdftk to apply the generated watermark PDF onto an eBook PDF.
As I understand it, pdftk applies the watermark PDF onto the eBook PDF page by page.
If I create a 62-line text file, with 61 empty lines and the watermark text on line 62, the watermark is applied properly at around 5/6 of the page height on every page of the eBook PDF.
If I add one more empty line, the watermark text disappears. It does not end up on the next page; it is simply not there.
My ultimate goal is to have the watermark text at the bottom of the second page of the eBook.
So I would need to create a 3-page PDF, with the first page empty, the watermark text at the bottom of the second page, and the third page empty again.
I tried to insert a page break into the text file using BBEdit, but I did not get the expected result.
Does anybody have a hint on how I could create a text file which, once printed to PDF with cupsfilter, produces the needed watermark PDF (first and third pages empty, and a line or two of text at the bottom of the second page)?
OK, so first: the manual is not entirely clear about the difference between stamp and multistamp, or between background and multibackground. It explains that a multipage watermark PDF is applied page by page onto the eBook PDF, and that if the watermark PDF has fewer pages than the eBook PDF, its last page is applied to all surplus pages of the eBook. This is correct, but only for the multistamp/multibackground options. With the stamp/background options, only the first page of the watermark PDF is applied to all pages of the eBook PDF. That was the first thing to figure out.
So I created two text files using echo, one empty (containing a single space) and one with one line of watermark text, and converted each to a PDF. Then I used the pdftk cat operation to merge the empty PDF with the watermark PDF, which gave me a two-page PDF with the first page empty and the second page containing the line of text. Then I merged this file once again with the empty PDF and ended up with a 3-page PDF.
Then I applied this 3-page watermark PDF to the eBook with the multibackground option and got what I wanted: no watermark on the first page, the line of text on the second page, and no watermark on the third and all other pages.
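The merge and apply steps described here can be sketched as a small Python helper that only builds the pdftk command lines. This is a sketch under assumptions: all file names are hypothetical placeholders, and it merges blank + text + blank in a single cat call using pdftk input handles (A=, B=) rather than the two separate merges described in the answer. The cat and multibackground operations themselves are standard pdftk.

```python
import shlex


def build_watermark_commands(blank_pdf, mark_pdf, ebook_pdf,
                             watermark_pdf="watermark3.pdf",
                             output_pdf="ebook_marked.pdf"):
    """Return the two pdftk argument lists: build the 3-page watermark, then apply it."""
    # Pages: blank, watermark text, blank. The handles let the blank file be reused.
    merge_cmd = ["pdftk", f"A={blank_pdf}", f"B={mark_pdf}",
                 "cat", "A", "B", "A", "output", watermark_pdf]
    # multibackground pairs watermark page N with eBook page N; the last
    # watermark page (blank here) repeats for all remaining eBook pages.
    apply_cmd = ["pdftk", ebook_pdf, "multibackground", watermark_pdf,
                 "output", output_pdf]
    return [merge_cmd, apply_cmd]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for cmd in build_watermark_commands("blank.pdf", "mark.pdf", "ebook.pdf"):
        print(shlex.join(cmd))
        # To actually run it: subprocess.run(cmd, check=True)
```

Keeping the commands as argument lists means they can be passed to subprocess.run without shell quoting problems if the file names contain spaces.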

Text changed to graphics, still selectable in PDF?

I have this PDF ebook with selectable text - the handwriting - but there is no such font embedded and the letters are all different, so it's not actually a font. How is this possible?
I've worked with CorelDraw and Adobe Acrobat, but I can't understand how this works.
The left side of the picture shows the document properties, the right side shows a page of the PDF file and I selected the last 3 rows. I can copy and paste that to a text file, no problem. How was this achieved?
There are a few possibilities, but the most likely one is that the text was converted to outlines/paths (vectors). Some software, such as Adobe InDesign and other print design apps, lets you 'flatten' font-based text into vectors or paths, meaning the original font doesn't need to be embedded or installed on the system. The original text data is, however, still present and can be copied into a text field or word processor.

PDFBox generates lines that turn solid black when I zoom out

When I draw lines using PDFBox, they turn solid black when I zoom out of the generated PDF file.
I'm creating a dashed pattern using the content stream's line methods (moveTo, lineTo). For the dash pattern and the line size I use setLineWidth and setLineDashPattern.
You can see code on my github repo (https://github.com/dmmax/pdfbox-dotted-pattern/blob/master/src/main/java/me/dmmax/pdfbox/dottedpattern/Main.java)
The picture below shows the two files opened side by side: my result (left side) and an example of how it should look (right side). Both files are zoomed to 50%.
Or you can check on your computer, just download two files:
1) My result: https://github.com/dmmax/pdfbox-dotted-pattern/blob/master/print.pdf
2) Example: https://github.com/dmmax/pdfbox-dotted-pattern/blob/master/informationyoushouldknow.pdf
Does anyone know how to fix the blacked-out lines when I zoom out of the resulting PDF?
Thanks a lot to @TilmanHausherr for his big help with this question.
If lines turn solid black when you zoom out of a PDF, it happens because the PDF makes the viewer render a lot of small objects, and when zoomed out they are drawn at the same (or nearly the same) size, so they run together.
What solved the problem for me was generating the dot/dash pattern (with the needed number of lines) in another PDF, then converting that PDF to an XObject and drawing it onto my current PDF.
Yes, it takes up more space, but there are no blackouts.

How to resize a PDF page with itext without scaling the content (in Java)

I have been trying for days to find a solution to my problem: I want to resize an existing PDF from A4 to a given custom, smaller page size. I need the real page size to be changed, not the crop box or something like that.
The original PDF will always consist of only one page, and all content (e.g. text (some with hyperlinks), images and tables) will fit into the desired page size. In effect, I want to trim the PDF page to a rectangle that exactly fits the existing content (the content starts at the upper left corner).
As I found no way to change the page size of an existing PDF page, I tried to create a new PDF with the desired page size and copy all the content of the original PDF into it. But that doesn't work either (I can create the new PDF page with the desired size, but I cannot copy the content).
Any solution (iText 5 or 7) is welcome.

How can I easily crop a PDF page?

How can I easily crop a PDF page in a given PDF file? I prefer using as little coding as possible, and guess border geometries as little as possible...
There are several options:
Crop by point-and-click using a GUI front-end:
pdf-quench
krop
briss
PDF scissors
Crop by using the command line:
pdfcrop command (provided by texlive-extra-utils), using the following arguments: pdfcrop --margins '-30 -30 -250 -150' --clip input.pdf output.pdf (-left -top -right -bottom format).
PDFCrop
convert -crop command (provided by imagemagick)
Ghostscript
Crop by writing your own script:
Python
LaTeX
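The pdfcrop invocation listed among the command-line options can also be assembled from a script. The following Python sketch only builds the argument list; the function name and the convention of passing crop amounts as positive numbers are assumptions made for illustration. It follows the left, top, right, bottom order given above and negates each value, since negative margins are what cut into the page.

```python
def pdfcrop_command(src, dst, left=0, top=0, right=0, bottom=0):
    """Build a pdfcrop argument list that trims the given amounts from each side.

    Amounts are given as positive numbers and negated here, because pdfcrop
    interprets positive --margins values as extra space to add.
    """
    margins = " ".join(str(-amount) for amount in (left, top, right, bottom))
    return ["pdfcrop", "--margins", margins, "--clip", src, dst]


if __name__ == "__main__":
    # Reproduces the example from the list above.
    cmd = pdfcrop_command("input.pdf", "output.pdf",
                          left=30, top=30, right=250, bottom=150)
    print(cmd)
    # To actually run it: subprocess.run(cmd, check=True)
```

Passing the margins as a single list element plays the role of the single quotes in the shell version, so the four numbers reach pdfcrop as one argument.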
For quick, GUI-aided PDF cropping tasks, try pdfarranger (available in Debian repos, formerly known as PDF-Shuffler).
For precise point-and-click cropping, one option is to use LibreOffice Draw.
The instructions below assume you want to crop part of a single-page PDF:
Start with a blank document
Select the Insert > Image... menu
Navigate to the PDF you wish to crop
The contents of the PDF will show up as an image
Right-click on the PDF content in your document and select the "Crop" menu item.
Use the handles to resize the viewable area of the PDF to the section you want to remain after cropping
Click outside of the PDF to disable the crop handles
Click again on the PDF content to position it however you want by:
Dragging it around the page
Using the arrow keys to move it
Use the Draw positioning tools to align or center the PDF content.
When you're happy with the result, save, export it to PDF, or print it.
For multi-page PDFs, you'll have to work page by page: first split the PDF into single pages using some other tool like PDF Arranger (or simply "Print to PDF" each page you want to crop from your PDF viewer), crop them one by one with Draw, then recombine them into a single PDF (using PDF Arranger again).
You could try using the pdfCropMargins Python program (https://pypi.org/project/pdfCropMargins/) with the -pg option to select the particular page. The command-line program offers many options, and also has an optional GUI.
You can use Inkscape to losslessly crop PDFs. This uses Inkscape's built-in SVG-PDF conversion.
Open your file in Inkscape: File -> Open -> select your file -> Open
Resize PDF:
Using user-input values: File -> Document properties -> Page -> Custom size
Using auto resize to content: File -> Document properties -> Page -> Custom size -> Resize page to content... -> set desired margin -> Resize page to drawing or selection
Inkscape is a particularly good option as often PDF crop utilities (such as krop, mentioned in other answers) do not change the actual size of the object, instead adjusting how much of the object (e.g. an A4 page) is displayed.
E.g. from the krop homepage:
"Unfortunately, there is no simple way to eliminate unnecessary/invisible parts of a PDF file. krop only adjusts which parts of a PDF are displayed; the original content is still there in the file and will, for instance, show up when editing the file in inkscape"
Editing directly in Inkscape does exactly what this says is impossible.
The list of tools provided by @sparkler was interesting, but did not help me very much.
Some of the tools listed did crop my pages, but usually they involved a conversion to an image, which made the PDF files blurry and hard to read.
In the end I used podofocrop of PoDoFo tools which was able to retain all the graphics at full resolution and the text as real text.
It will crop all pages to the minimal size (i.e. without a border).
The command is: podofocrop input.pdf output.pdf
To install on macOS, use brew install podofo