I am using the following command to crop a PDF:
gswin32c -dQUIET -dBATCH -dNOPAUSE -dNOPROMPT -sDEVICE=pdfwrite -dFirstPage=1
-dLastPage=1 -o output.pdf -dDEVICEWIDTHPOINTS=237 -dDEVICEHEIGHTPOINTS=151
-dFIXEDMEDIA -c "<</PageOffset [-64 -396]>> setpagedevice" -f input.pdf
My intention is to crop input.pdf by 3.4 inches from the top boundary, 0.9 inches (0.9 x 72 ≈ 64) from the left boundary, 5.5 inches (5.5 x 72 = 396) from the bottom boundary, and 4.3 inches from the right boundary. Everything seems to work fine, and output.pdf displays appropriately cropped when viewed on a desktop (Adobe, Chrome browser, etc.). However, when the same PDF is viewed on an iOS or Android device, it seems that the page shift has occurred along the dimensions mentioned, but areas from the uncropped regions are still visible. It almost seems as if the page size was not applied correctly.
After reading some online forums, I have also tried the following commands, but none of them seems to have any cropping effect, even in a desktop viewer:
gswin32c -dQUIET -dBATCH -dNOPAUSE -dNOPROMPT -sDEVICE=pdfwrite -dFirstPage=1 -dLastPage=1 -o output.pdf -c "[/CropBox [64 396 237 151] /PAGES pdfmark" -f input.pdf
gswin32c -sDEVICE=pdfwrite -dFirstPage=1 -dLastPage=1 -o output.pdf -dDEVICEWIDTHPOINTS=237 -dDEVICEHEIGHTPOINTS=151 -dFIXEDMEDIA -c "237 151 translate 64 396 237 151 rectclip" -f input.pdf
Any help offered is very much appreciated unless this is a bug!
Many Thanks,
Kaushik
This was opened as a Ghostscript bug report, and it was investigated and answered there. You can find the details at:
http://bugs.ghostscript.com/show_bug.cgi?id=693081
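The numbers in the first command of the question can be reproduced with a little arithmetic. The following is only a sketch: the function name is hypothetical, and it assumes the source page is US Letter (612 x 792 points), which the question does not state. Fractional points are truncated, as the question does.

```python
# Sketch: derive Ghostscript crop parameters from margins given in inches.
# ASSUMPTION: the source page is US Letter (612 x 792 pt); the question
# does not say what the original page size is.
POINTS_PER_INCH = 72


def crop_parameters(left_in, bottom_in, right_in, top_in,
                    page_width_pt=612, page_height_pt=792):
    """Return (offset_x, offset_y, width, height) in whole points."""
    left_pt = left_in * POINTS_PER_INCH
    bottom_pt = bottom_in * POINTS_PER_INCH
    right_pt = right_in * POINTS_PER_INCH
    top_pt = top_in * POINTS_PER_INCH
    # Whatever is left after removing the margins becomes the new media size.
    width = int(page_width_pt - left_pt - right_pt)
    height = int(page_height_pt - bottom_pt - top_pt)
    # PageOffset shifts the content down and left, hence the negative signs.
    return -int(left_pt), -int(bottom_pt), width, height


ox, oy, w, h = crop_parameters(left_in=0.9, bottom_in=5.5,
                               right_in=4.3, top_in=3.4)
print(f'-dDEVICEWIDTHPOINTS={w} -dDEVICEHEIGHTPOINTS={h} -dFIXEDMEDIA '
      f'-c "<</PageOffset [{ox} {oy}]>> setpagedevice" -f input.pdf')
```

With the margins from the question this yields a 237 x 151 point media size and a PageOffset of [-64 -396], matching the command shown there.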
Related
The application receives a PDF whose content does not cover the entire page. The coordinates and dimensions of the content are sent along with that data. Below is a sample command to create cropped.png from target.pdf, where the content starts at (X: 179, Y: 212) and has the size (W: 600, H: 400).
gswin64c -q -sDEVICE=png16m -o "cropped.png" -dNumRenderingThreads=4 -dSAFER -dBATCH -dNOPAUSE -dLastPage=1 -c "<< /PageOffset [-179 -212] /Page >> setpagedevice" -r144 -dTextAlphaBits=4 -dGraphicsAlphaBits=4 -f "target.pdf"
It creates the PNG with the correct offset; however, I could not find a way to define the area of the PDF that should be captured. In other words, the output must be a PNG that contains only the content within the given box.
Ghostscript 9.22 is installed on Windows 10. I have ImageMagick at my disposal too.
Is there a way to achieve this? Or am I approaching this incorrectly? Any thoughts would be greatly appreciated!
Thank you
You've offset the origin, but you haven't specified the media size, so the media size will be whatever it was in the original PDF file.
You need to add -dDEVICEWIDTHPOINTS=600 -dDEVICEHEIGHTPOINTS=400 -dFIXEDMEDIA, assuming that the media size you've given is correct. Obviously I can't tell where the actual white space is without seeing the original!
At 144 dpi there's no point in using -dNumRenderingThreads. Realistically, that's only useful at reasonably high resolution; at this resolution you'll just slow it down.
This:
-c "<< /PageOffset [-179 -212] /Page >> setpagedevice" -r144 -dTextAlphaBits=4 -dGraphicsAlphaBits=4 -f
Is incorrect, and I'm surprised it doesn't throw an error. When you specify -c, everything after that, up to the -f, is treated as PostScript. The -r144 etc. are not valid PostScript and I'd expect them to throw an error. You would do much better to move the -c "<< /PageOffset [-179 -212] /Page >> setpagedevice" -f to immediately before the input filename.
So I'd suggest your command line should be:
gswin64c -q -dSAFER -dBATCH -dNOPAUSE -sDEVICE=png16m -o "cropped.png" -dLastPage=1 -r144 -dTextAlphaBits=4 -dGraphicsAlphaBits=4 -dDEVICEWIDTHPOINTS=600 -dDEVICEHEIGHTPOINTS=400 -dFIXEDMEDIA -c "<< /PageOffset [-179 -212] /Page >> setpagedevice" -f "target.pdf"
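If the application builds this command for many files, the ordering rule (ordinary switches first, then -c, the PostScript, -f, and the input file) can be captured in a small helper. This is a minimal sketch, not a definitive implementation: the function names are hypothetical, the box is assumed to be in PDF points with the origin at the bottom-left corner, and the dictionary passed to setpagedevice is assumed to need only the /PageOffset key.

```python
def crop_to_png_args(pdf_path, png_path, x, y, width, height, dpi=144):
    """Build a Ghostscript argument list that renders one box of page 1.

    x, y, width and height are assumed to be PDF points measured from
    the bottom-left corner of the page.
    """
    return [
        "gswin64c", "-q", "-dSAFER", "-dBATCH", "-dNOPAUSE",
        "-sDEVICE=png16m", "-o", png_path, "-dLastPage=1",
        f"-r{dpi}", "-dTextAlphaBits=4", "-dGraphicsAlphaBits=4",
        # Fix the media to the size of the box to be captured.
        f"-dDEVICEWIDTHPOINTS={width}",
        f"-dDEVICEHEIGHTPOINTS={height}",
        "-dFIXEDMEDIA",
        # Everything between -c and -f is PostScript, so it goes last,
        # immediately before the input file.
        "-c", f"<< /PageOffset [{-x} {-y}] >> setpagedevice",
        "-f", pdf_path,
    ]


def output_pixels(width_pt, height_pt, dpi):
    """Pixel size of the rendered image: points * dpi / 72."""
    return round(width_pt * dpi / 72), round(height_pt * dpi / 72)


args = crop_to_png_args("target.pdf", "cropped.png", 179, 212, 600, 400)
print(" ".join(args))
print(output_pixels(600, 400, 144))
```

The list form can be handed to subprocess.run(args, check=True) without any shell quoting. For the 600 x 400 point box at 144 dpi the resulting PNG should be 1200 x 800 pixels.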
Using gswin32c.exe -o nul -sDEVICE=bbox bbox.pdf, I know that the BoundingBox of this PDF is:
%%BoundingBox: 6292 6865 8108 7535
%%HiResBoundingBox: 6292.907808 6865.505790 8107.091753 7534.493770
I want to get a PDF that contains only the content inside the BoundingBox.
I am using the following command to crop a PDF:
gswin32c -sDEVICE=pdfwrite -dFirstPage=1 -dLastPage=1 -o croped.pdf -dDEVICEWIDTHPOINTS=1815 -dDEVICEHEIGHTPOINTS=670 -dFIXEDMEDIA -c "6292 6865 translate 6292 6865 8107 7534 rectclip" -f bbox.pdf
or
gswin32c -dQUIET -dBATCH -dNOPAUSE -dNOPROMPT -sDEVICE=pdfwrite -dFirstPage=1 -dLastPage=1 -o croped.pdf -dDEVICEWIDTHPOINTS=1815 -dDEVICEHEIGHTPOINTS=670 -dFIXEDMEDIA -c "<</PageOffset [6292 6865]>> setpagedevice" -f bbox.pdf
Both give me a blank PDF file.
With this command:
gswin32c.exe -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=croped.pdf -c "[/CropBox [6292.907808 6865.505790 8107.091753 7534.493770] /PAGES pdfmark" -f bbox.pdf
I get the original file, uncropped.
How can I crop this PDF correctly?
Thanks very much!
The BoundingBox looks suspicious to me.
In any event you cannot trivially do what you are trying to do with Ghostscript, because the PDF interpreter uses the information in the PDF file to set the media size.
The first two command lines 'might' work, but you've translated the CTM in the wrong direction. You've moved the origin (0,0) from the bottom left, up and right. That's moved the content of the page further off the media, which is why you get a blank page. You could try using the same values, but negated, so that the origin moves down and left. From the BoundingBox you quoted, that's the correct direction.
gswin32c -sDEVICE=pdfwrite -dFirstPage=1 -dLastPage=1 -o croped.pdf -dDEVICEWIDTHPOINTS=1816 -dDEVICEHEIGHTPOINTS=670 -dFIXEDMEDIA -c "-6292 -6865 translate" -f bbox.pdf
You don't need the rectclip, because the content is already clipped to the page.
The third command line would also work, except that you've set the CropBox before processing the PDF file, so the PDF interpreter reads the CropBox from the PDF file and overwrites the one you set. Try setting it after the input file.
gswin32c.exe -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=croped.pdf bbox.pdf -c "[/CropBox [6292.907808 6865.505790 8107.091753 7534.493770] /PAGES pdfmark" -f
[EDIT]
OK, so the reason the first command line doesn't work is (as I suspected) that the PDF interpreter resets the graphics state before running the PDF, so it simply throws away the 'translate'.
The second command line works perfectly well for me if you negate the operands in the array for PageOffset:
gswin32c -sDEVICE=pdfwrite -sOutputFile=\temp\out.pdf -dDEVICEWIDTHPOINTS=1815 -dDEVICEHEIGHTPOINTS=670 -dFIXEDMEDIA -c "<</PageOffset [-6292 -6865]>>setpagedevice" -f D:\Users\ken\Downloads\bbox.pdf
The third command line doesn't work because it sets the CropBox for all Pages, which is a default and can be overridden by setting a CropBox on each page. Your original PDF file contains a CropBox (identical to the MediaBox) which is preserved by the PDF interpreter, so the PAGES CropBox is overridden by the CropBox specific to the page.
But the command line above worked fine for me.
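The step from the bbox device's output to the working PageOffset command is mechanical: the media size is the box's width and height, and the offset is the negated lower-left corner. A minimal sketch, with hypothetical function names and the sample output copied from the question:

```python
import re


def parse_bbox(bbox_output):
    """Extract (llx, lly, urx, ury) from the '%%BoundingBox:' line."""
    match = re.search(
        r"^%%BoundingBox:\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)",
        bbox_output,
        re.MULTILINE,
    )
    if match is None:
        raise ValueError("no %%BoundingBox line found")
    return tuple(int(value) for value in match.groups())


def crop_flags(llx, lly, urx, ury):
    """Ghostscript switches that crop the page to the given box."""
    return [
        f"-dDEVICEWIDTHPOINTS={urx - llx}",
        f"-dDEVICEHEIGHTPOINTS={ury - lly}",
        "-dFIXEDMEDIA",
        # Negate the lower-left corner so the origin moves down and left.
        "-c", f"<</PageOffset [{-llx} {-lly}]>> setpagedevice",
        "-f",
    ]


sample = """%%BoundingBox: 6292 6865 8108 7535
%%HiResBoundingBox: 6292.907808 6865.505790 8107.091753 7534.493770"""

print(" ".join(crop_flags(*parse_bbox(sample))))
```

The integer BoundingBox is used rather than the HiResBoundingBox because it is already rounded outwards, so no content is clipped at the edges.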
I have a ghostscript command that converts a pdf into several JPG images (one for every page). The command arguments are as follows:
-q -dUseCropBox -dFirstPage=1 -dLastPage=1 -dBATCH -dDOINTERPOLATE -dNOPAUSE -dSAFER -dTextAlphaBits=4 -dGraphicsAlphaBits=4 -dPrinted=false -r250 -sDEVICE=jpeg -dJPEGQ=100 -sOutputFile=output.jpg input.pdf -c quit
The PDF is 1.5 MB, but the JPEG image becomes huge (~15 MB), with dimensions of 8829 x 15551 pixels.
If I change the resolution in the Ghostscript command to -r150, the image size is acceptable, but the image looks very pixelated.
Is there another way to decrease the size of the image without affecting its quality?
Thanks
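The relationship between -r and the pixel dimensions is plain arithmetic, which helps when choosing a resolution. The sketch below uses the figures from the question; the helper names are hypothetical.

```python
def page_size_inches(width_px, height_px, dpi):
    """Physical page size implied by a rendering at the given resolution."""
    return width_px / dpi, height_px / dpi


def pixels_at(width_px, height_px, old_dpi, new_dpi):
    """Pixel dimensions of the same page rendered at another resolution."""
    scale = new_dpi / old_dpi
    return round(width_px * scale), round(height_px * scale)


# Figures from the question: 8829 x 15551 pixels at -r250.
print(page_size_inches(8829, 15551, 250))
print(pixels_at(8829, 15551, 250, 150))
```

By this arithmetic the page is roughly 35 x 62 inches, so the large output comes from an unusually large page rather than from the resolution alone.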
I want to convert an EPS figure to a PDF figure with the same width and height.
The following command:
gswin32 -dSAFER -dNOPLATFONTS -dNOPAUSE -dBATCH \
-sDEVICE=pdfwrite -sPAPERSIZE=letter -dCompatibilityLevel=1.4 \
-dPDFSETTINGS=/printer -dCompatibilityLevel=1.4 -dMaxSubsetPct=100 \
-dSubsetFonts=true -dEmbedAllFonts=true -sOutputFile="test.pdf" \
-f "test.eps"
only produces a PDF file with the page size of a letter.
Any help would be much appreciated.
Here is the test EPS file: https://dl.dropboxusercontent.com/u/45318932/test.eps
EPS files cannot contain a media size request. In the absence of any media size request Ghostscript uses the default.
However.....
From the documentation:
http://www.ghostscript.com/doc/9.15/Use.htm#EPS_parameters
-dEPSCrop :
Crop an EPS file to the bounding box. This is useful when converting an EPS file to a bitmap.
To make KenS' answer more explicit, using the test.eps sample file you linked to... the following command will suffice to do what you want:
gswin32 \
-sDEVICE=pdfwrite \
-dPDFSETTINGS=/printer \
-dEPSCrop \
-o test.pdf \
test.eps
The -o test.pdf is (for not too ancient versions of Ghostscript!) shorthand for -dNOPAUSE -dBATCH -sOutputFile=test.pdf.
Your test.eps uses a font named /SHZENL+Tahoma_00. Ghostscript will automatically embed this font, and it will be a subset by default (the prefix SHZENL may change in the PDF, though).
Here is a screenshot from the page the command from your question created. That page is 612 x 792 pts (letter size):
Here is the screenshot from the page the command given in my answer created. Its page size is 360 x 216 pts:
I want to extract the first page of a PDF as PNG to do some image processing on it with this command:
$ gs -q -dNOPAUSE -dBATCH -dEPSCrop -sDEVICE=pngalpha -dLastPage=1 -sOutputFile='test.png' 'test2.pdf'
It works well for most PDFs but it adds a transparent margin on this one: http://ubuntuone.com/23676W4TJPyX6W2pkp5guG
Gimp does it as expected (no margin), convert has the same issue, -sDEVICE=jpeg also.
Is there any way to avoid it ?
Ghostscript doesn't add margins, and it certainly doesn't add transparent ones. The problem is not with Ghostscript, it's with your PDF file. Your file contains:
/MediaBox [0 0 595 842]
/CropBox [27.5 61.0 567.5 781.0]
Ghostscript uses the MediaBox, other viewers may or may not use the CropBox. If you read the GS documentation you will find the -dUseCropBox switch which directs GS to use the CropBox of a PDF file instead of the MediaBox when setting the media size.
-dEPSCrop isn't going to do anything at all with a PDF file.
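The "margin" is simply the difference between the two boxes. A short sketch with hypothetical helper names, using the box values quoted above:

```python
def box_size(box):
    """Width and height of a PDF box given as [llx, lly, urx, ury]."""
    llx, lly, urx, ury = box
    return urx - llx, ury - lly


def crop_margins(media_box, crop_box):
    """Space between the MediaBox and the CropBox on each side, in points."""
    m_llx, m_lly, m_urx, m_ury = media_box
    c_llx, c_lly, c_urx, c_ury = crop_box
    return {
        "left": c_llx - m_llx,
        "bottom": c_lly - m_lly,
        "right": m_urx - c_urx,
        "top": m_ury - c_ury,
    }


media_box = [0, 0, 595, 842]
crop_box = [27.5, 61.0, 567.5, 781.0]

print(box_size(media_box))
print(box_size(crop_box))
print(crop_margins(media_box, crop_box))
```

Rendering with the MediaBox gives a 595 x 842 point page, while the CropBox covers only 540 x 720 points of it, leaving 27.5 points at the left and right and 61 points at the top and bottom.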
For the record, if anyone runs into the same issue, I just found the proper switch: -dUseCropBox. The final command is now:
$ gs -q -dUseCropBox -dNOPAUSE -dBATCH -sDEVICE=pngalpha -dLastPage=1 -sOutputFile='test.png' 'test2.pdf'