Pdf Editing: change page size WITHOUT resizing content or rotate page WITHOUT rotating content - pdf

I receive postage labels from a supplier as single page pdf documents. The labels would fit on an A5 sheet but they are presented as a portrait within an A4 page, also in portrait orientation. I would like to be able to print two of these labels per A4 page to cut down on waste.
This can be achieved by rotating the page content without rotating the page itself. Or by resizing the page by swapping the height and width about the content. I am aware that both of these things can result in content being lost, which isn't a problem for my use case. Ideally I'd like a command line application that works on both Linux or Windows machines. Unfortunately, web searches for "rotate" or "resize" pdf will point to the many applications that just rotate or resize pdf pages along with the content which isn't what I want.
Similar questions:
With PdfBox: identical use case, see my comments on PdfBox below.
With iText: almost identical use case, I explicitly don't want any resizing of the content. See my comments on iText below as well.
Things I have investigated tried:
pdftk - too basic
ImageMagick - the original image contains transparency and the extent argument results in a visible loss of quality
pdfjam - also requires install of Latex and PdfPages. Ideally I'd like something that works on both Windows and Linux.
iText7 - the documentation isn't great. Looks like it was completely re-written in the last few years and the Nuget feed makes it clear that previous version, iTextSharp, is EOL. Consequently most of the examples one finds online (including on this site) are out of date. iText7 doesn't let you resize a page. I got as far as saving a document with a new page that was the right size but struggling to copy the content over. I think I could get what I wanted from this but it would take a long time and I'm trying to do something simple.
PdfBox - I've already tried one .NET library without success. Looking at the comments to the question I've linked above, this one seems to also have a version issue. I'm trying to do something really simple here, I will try this one if I exhaust all other avenues
Gimp - does what I want but I have to fire up the application, point and click quite a few times to rescale the image canvas, set the background and export
Screenshot the label from a pdf reader at 100% size and paste into a Word/LibreOffice doc. Sadly this is the most reliable method I have at the moment
I have example labels but they contain the name and address of people I've sent things to, I'd rather not upload them.

Try the command line tool cpdf from here: https://community.coherentpdf.com
cpdf -rotate-contents <angle> in.pdf -o out.pdf
to rotate contents without rotating the page. or...
cpdf -mediabox "100 100 600 500" in.pdf -o out.pdf
(and -cropbox and so on) to change page dimensions without altering content. Chapter 3 of the manual is of relevance.
You can also prepare the file by removing any page rotation whilst counter-rotating the content to leave the visual appearance unchanged:
cpdf -upright in.pdf -o out.pdf

Related

How can I renderize a PDF into BMP fitting content to PDF page boundaries?

I am getting a BMP from a PDF with GhostScript, but its content is not fitted into page boundaries. Even I try any option, I am not able to get the content fitted.
I've tried to generate the BMP with different GhostScript options, but noone seems to fit 100% ok the content.
This is the last command I tried. Please, don't expect it to have what I need, just copied & paste from tty.
gs -dBATCH -dNOPAUSE -sPAPERSIZE=a4 -dFIXEDMEDIA -dPSFitPage -sDEVICE=bmpmono -sOutputFile=Betlem.bmp -g1184x968 -c "<</PageSize [900 500]>> setpagedevice 0 0 translate" -c "<</PageOffset [-23 -100]>> setpagedevice" -f Betlem.pdf
I am expecting to get the content fitted into the BMP image borders, without exception of a pixel. I am using an OpenCV & Python function to extract content and fit in new image and this is the debug:
initial BMP image resolution = (872, 900)
BMP image resolution after fit content into new page = (541, 870)
Have a look to the following thread for the fitting funtion in Python:
I can't find a way to fit contour on new image zero point
You are using PSFitPage for a PDF file, you should be using PDFFitPage or just FitPage.
Note that the 'fitting' in this case is fitting the PDF media size to the existing media. If the PDF content leaves white space around the edge of the media, then the resulting scaling will include that.
In addition you are using PostScript to offset the page origin, which will introduce white space, and you are trying to change the media size, which won't work because you've set -dFIXEDMEDIA. Using these in combination with any of the FitPage switches is not likely to work well.
Randomly stabbing at controls and copying bits of code intended to solve different problems isn't likely to help you I'm afraid.
Without seeing an example file I can't, of course, tell you how to solve your problem, and I'm not really sure exactly what you are trying to achieve. A bitmap with no white space ? A bitmap of a given size with no white space ? Something else ?
[Edit]
OK so looking at the PDF file, the media box is 11.69x8.27 inches, there is white space at the top, bottom, left and right between the marks on the page and the edge of the media.
Running this through Ghostscript, to TIFF at 72 dpi results in a file which Adobe Photoshop says is 11.694x8.264 inches and has white space at top bottom left and right, just like the PDF file.
By default Ghostscript uses the Media size from the PDF to render to, however you can change this. If you were to change the media size to (say) 5.8x4.14 inches, set -dFIXEDMEDIA and then rendered the PDF file what would happen is that the top and right hand side of the PDF file would be 'off the page' so you would only get the left hand portion rendered. Try this:
gs -DEVICEWIDTHPOINTS=421 -dDEVICEHEIGHTPOINTS=298 -dFIXEDMEDIA "A betlem m en vull anar(1).pdf"
You will see the white space is still present at bottom and left, and the top and right have fallen off the page.
Now, if you add FitPage that will scale the original media down until it fits the new media size (and all the content too, of course). If you try:
gs -DEVICEWIDTHPOINTS=421 -dDEVICEHEIGHTPOINTS=298 -dFIXEDMEDIA -dFitPage "A betlem m en vull anar(1).pdf"
You'll see that the output is the same physical dfimensions as the previous command, but now the whole of the PDF content can be seen because its been scaled down. You should also see that the distribution of white space has changed, because I didn't strictly divide by 2 in each direction. The FitPage switch scaled the content in both directions by the same amount, and distributed the extra space in the x direction evenly to each side, as new white space.
Now I've no clue what you mean by 'simmetric'. You can undoubtedly do what you want using Ghostscript and the PostScript language, but I don't know what it is you want. Pointing me at Python code isn't going to help I'm afraid, I don't speak Python.
I can say that Ghostscript does not add extra white space that isn't present in the original unless you mess with the rendering by addding parameters like FitPage and FIXEDMEDIA.
If you can explain what you are trying to achieve I can probably tell you what to do.

Ghostscript add white background image

I have a script which automatically adds a gutter to a PDF file. It adds gutter to left for ODD numbered pages and gutter to the right for EVEN numbered pages. It does this by moving the existing image over.
Here is the code for that:
'gs -sDEVICE=pdfwrite -dPDFSETTINGS=/printer -o output.pdf \
-dDEVICEWIDTHPOINTS=513 \
-dDEVICEHEIGHTPOINTS=738 -dFIXEDMEDIA -c \
"<< /CurrPageNum 1 def /Install { /CurrPageNum CurrPageNum 1 add def CurrPageNum 2 mod 1 eq \
{-4.5 0 translate} {4.5 0 translate} \
ifelse } bind >> setpagedevice" -f input_file.pdf
I've found that when I send this PDF file to the printer, the additional space is not "counting" so the file is now narrower now. I think this is because transparency doesn't count on the PDF, and so when sent to the printer the pages are seen as narrower.
Is it possible to add a white background to the pdf so it ISN'T seen as transparent? Or is there an alternative way to fix this?
I'm afraid your assumption is flawed, your 'translate' has no transparency involvement at all, its shifting the content on the media (NB this is not an image, ie a bitmap, in general. Its more complex content). All the content is shifted, no matter whether it is transparent or not.
I'm afraid I can't follow what you mean about the printed page being 'narrower'. The Media request will be for a page 513x738 points, which is a really weird size; 7.125 by 10.25 inches. Unles that matches the page size of your printer, then its going to do 'something' with the result. Probably it will center it if the media is larger than the request, but if the media is smaller than requested, then it will either scale it down or crop it. Either will result in something different to what you expect.
Is there a reason you are changing the media size of the original PDF file ?
If the media request does match the printer then its still possible that there will be cropping or scaling going on, because the printable area may not be the same as hte size of the media. The paper handling of some printers means that they cannot print all the way to the edge of the media. In that case the printer may scale or crop the output again.
You can easily elimiate transparency as being the culprit by simply starting with a test file which does not contain any transparency. If you aren't certain then one solution owuld be to use a recent version of Ghostscript and use the pdfimage32 device. That will create a PDF file from the original PDF, but the output file will only contain a bitmap image, no transparency at all.
To help us consider the problem, it would be helpful to see the original PDF file, the PDF file you send to the printer, and a scan or photograph of the final printed page. It would also be useful to know the version of Ghostscript you are using, the make and model of the printer, and how you are sending the PDF file to the printer.

How do you create a PDF for Kindle Direct Publishing with wkhtmltopdf

Kindle Direct Publishing AKA Create Space, wants a PDF/X-1a in 6x9 format with 0.25" outside margins and 0.375" inside/gutter margins, which I need help with, since Qt PDF generator does not do inside and out, so I have to set them both for the largest, and I need to know if my css effects this, if so how, but setting --margin-left .375in --margin-right .375in gives me this error: Currently all margin units must be the same, not sure what that means, why have a left and right if they must both be the same, and does this really have to apply to top and bottom, what is the thinking, so I added it to top and bottom just to make the file, but it is not what I wanted for margins, I wonder if gs can fix this?
If so how.
I know that wkhtmltopdf currently only creates PDF version 1.4, and Kindle does not seem to mind that much on upload, I do not have a published upload yet, so I hope someone has and knows this from experience, because I do not know if they will accept that yet, so I also use Ghost Script to convert that to PDF version 1.7, this is what I have currently:
PDF_Combine is a bash array of files:
PDF_Combine=("file1.html" "file2.html");
Update: Now KDP wants .875in margins, on both sides my content is real small, how dose CSS effect Margins in a PDF, can I set the Margins to 0 in wkhtmltopdf and adjust them in my CSS, if so how, in the body?
wkhtmltopdf --margin-left .375in --margin-right .375in --margin-bottom .375in --margin-top .375in --page-width 6in --page-height 9in --load-error-handling ignore --javascript-delay 3333 --enable-forms --footer-center "[page]/[topage]" "${PDF_Combine[#]}" "/MyPath/MyFileName.pdf"
The Ghost Script:
gs -dPDFA -dBATCH -dNOPAUSE -sProcessColorModel=DeviceCMYK -sDEVICE=pdfwrite -sPDFACompatibilityPolicy=1 -sOutputFile="/MyPath/MyOutPutFileName.pdf" "/MyPath/MyInPutFileName.pdf"
PDF/X-1a is a highly specific restricted set of PDF features; for instance you cannot use the RGB colour model when producing a PDF/X-1a
You also need specific keys to be present in the Info dictionary of the PDF file.
Ghostscript does not create PDF/X-1a files. It can create PDF/X-3 files, but that's no good to you.
You can process a PDF file using Ghostscript to add white space around the page, what you need to do is specify a larger media size, and tell Ghostscript its a fixed size, so the PDF file can't change it. Then you need to offset the content up and right on the new media (because otherwise the content will be rendered in the bottom left corner). Note; its not clear to me if you want to increase the media size, or reduce the content so that it fits into the required media, but has margins.
See the answer and comments on the question here.

How to rasterize "big" PDF files without losing thin lines?

I'm trying (in a script on a linux server) to shrink and rasterize several thousands PDF files that come from various CAD/CAM softwares and represent "big" drawings (as in, 800x600mm or the like) with lots of thin lines (as in, similar to a 0.2mm pen).
The rasterized files should have visible lines when printed on A5 or similar paper, so I have to kind of "shrink" the original drawing while preserving line thickness. As an example, when I open one of those PDF files on Mac OSX Preview, it does exactly that: when I zoom in and out it adjusts line thickness so they always look the same on screen.
I tried doing that with ImageMagick and tried lots of -density, -resize and various other settings without great success: the thin lines just get scaled down as anything else and end up being too thin (or to disappear completely, in some cases) to be discernible when printed to a small size. I've also read through its documentation without any success. Of course I'm also open to using other tools, as far as I can script it.
How could I "preserve line thickness" when rasterizing a vector PDF file in a script, just like Apple's Preview does when viewing the same file on screen?

Undo Pdfnup Operation

I have a Pdf file which contains several slides per page, including text (not only images).
This pdf was probably created using pdfnup.
Can I revert the pdfnup operation so that each slide is shown on one page?
As far as I know, there is no simple to be used 'undo' operation.
However, the following answers show you the approach principle, how you can achieve the undo-equivalent operation using Ghostscript:
Convert PDF 2 sides per page to 1 side per page (Superuser)
How can I split a PDF's pages down the middle? (Superuser)
Cropping a PDF using Ghostscript 9.01 (Stackoverflow)
PDF - Remove White Margins (Stackoverflow)
(Should these not help you to find the final solution, ask again. But then to come up with a fully working commandline, I'd need the complete output of the following command first: pdfinfo -f 1 -l 100 -box your.pdf.)