Convert PDF to PCL using Ghostscript 9.15

The requirement is to convert a PDF to PCL with an embedded macro. (I am currently testing this on Windows, but I will need to use it at runtime in an application and print from UNIX.) The macro will be used later in another document to embed this cropped image, with everything printed on a single page. I will use PCL escape codes to call the macro number, and then the image will be printed. (You can think of it as a logo image.)
Using a CropBox, I am able to turn the PDF with surrounding whitespace into a PDF without it:
"c:\progra~1\gs\gs9.15\bin\gswin64.exe" -o _sourcePDFcropped.pdf ^
-sDEVICE=pdfwrite -c "[/CropBox [1 140 320 650] /PAGES pdfmark" ^
-f _sourcePDF.pdf
However, when I convert this _sourcePDFcropped.pdf to PCL, the output still contains the whitespace:
"c:\progra~1\gs\gs9.15\bin\gswin64c.exe" -dBATCH -dNOPAUSE ^
-sDEVICE=pxlcolor -g100x200 -sOutputFile=_sourceFedGroundCroppedTest.pcl ^
-f _sourcePDFcropped.pdf
I tried MKPCL and it does the job, but because it is not well supported I would rather use Ghostscript:
MKPCL.EXE -c4 -t -m 100 -p Image.jpg Image.MAC
I also tried ImageMagick, which uses Ghostscript internally. So I am guessing that with the right Ghostscript switches I should be able to achieve my goal.
Input PDF File: Click Here
P.S.: I have seen other PDF-to-PCL questions on Stack Overflow, but those are about straightforward PDF-to-PCL conversion. Mine is about cropping the PDF and producing PCL output.

I processed the sample input PDF with the following command line, using a self-compiled Ghostscript v9.16 (unreleased, built from the current GhostPDL Git sources):
gs -o - \
-sDEVICE=pdfwrite \
-c "[/CropBox [1 140 320 650] /PAGES pdfmark" \
-f source.pdf \
\
| gs -o tst.pcl \
-sDEVICE=pxlcolor \
-dUseCropBox \
-f -
(As you may well have noticed, I'm connecting 2 different Ghostscript commands through a pipe in order to save writing a temporary PDF file to disk.)
If you want to do the same on Windows, the command line in a cmd.exe/DOS box would be:
gswin64c.exe -o - ^
-sDEVICE=pdfwrite ^
-c "[/CropBox [1 140 320 650] /PAGES pdfmark" ^
-f source.pdf ^
^
| gswin64c.exe -o tst.pcl ^
-sDEVICE=pxlcolor ^
-dUseCropBox ^
-f -
Then I opened it with the self-compiled PCL viewer (also from GhostPDL sources), pcl6:
pcl6 tst.pcl
This is a screenshot showing the pcl6 window:
As KenS also pointed out: it is important to use -dUseCropBox when processing the cropped PDF intermediate data!

Adding a CropBox doesn't really do much: it leaves the page content exactly the same and only adds a CropBox entry for the page. Ghostscript will usually use the MediaBox, not the CropBox, so adding a CropBox to a PDF has no effect on the output by default.
You could try adding -dUseCropBox. If the white space you think is being added is in fact present in the original PDF, but masked by the CropBox, then using -dUseCropBox will have GS use the CropBox when rendering the PDF.
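To get a feel for the page size that the CropBox from the question should produce once -dUseCropBox is in effect, the box arithmetic can be sketched in shell. The helper name cropbox_size is made up for this illustration; it only subtracts the corner coordinates (in points) and does not call Ghostscript.

```shell
# Hypothetical helper: width and height (in points) of a PDF box given
# as "llx lly urx ury", the order used in the /CropBox pdfmark.
cropbox_size() {
    llx=$1 lly=$2 urx=$3 ury=$4
    echo "$((urx - llx)) $((ury - lly))"
}

# CropBox from the question: [1 140 320 650]
cropbox_size 1 140 320 650
```

For the box in the question this prints 319 510, that is, a page of 319 x 510 pt. If the PCL output is noticeably larger than that, the MediaBox is probably still being used.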

Related

How to trim unwanted text in PDF?

I can crop PDF (for example in Acrobat).
But text outside of the crop margin will still be maintained in the PDF (even though I don't see it in the viewable area).
I want to remove anything outside the crop margin. Is there a command line tool that can do so?
Ghostscript can do that. Ghostscript is a command line tool which is available for all major operating systems.
The command which does it for Linux or Mac OS X:
gs -o cropped-and-removed.pdf \
-sDEVICE=pdfwrite \
-dUseCropBox \
in.pdf
The command for Windows:
gswin64c.exe -o cropped.pdf ^
-sDEVICE=pdfwrite ^
-dUseCropBox ^
in.pdf
Be sure to use a reasonably recent version of Ghostscript. At the time of writing, the current version is v9.16.

Ghostscript: How to convert EPS to PDF with the same page size

I want to convert an EPS figure to a PDF figure with the same width and height.
The following command:
gswin32 -dSAFER -dNOPLATFONTS -dNOPAUSE -dBATCH \
-sDEVICE=pdfwrite -sPAPERSIZE=letter -dCompatibilityLevel=1.4 \
-dPDFSETTINGS=/printer -dMaxSubsetPct=100 \
-dSubsetFonts=true -dEmbedAllFonts=true -sOutputFile="test.pdf" \
-f "test.eps"
produces a PDF file with a letter-size page instead.
Any help would be much appreciated.
Here is the test EPS file: https://dl.dropboxusercontent.com/u/45318932/test.eps
EPS files cannot contain a media size request. In the absence of any media size request, Ghostscript uses the default media size.
However, the documentation describes a switch that helps here:
http://www.ghostscript.com/doc/9.15/Use.htm#EPS_parameters
-dEPSCrop :
Crop an EPS file to the bounding box. This is useful when converting an EPS file to a bitmap.
To make KenS' answer more explicit, using the test.eps sample file you linked to... the following command will suffice to do what you want:
gswin32 \
-sDEVICE=pdfwrite \
-dPDFSETTINGS=/printer \
-dEPSCrop \
-o test.pdf \
test.eps
The -o test.pdf is (for not too ancient versions of Ghostscript!) shorthand for -dNOPAUSE -dBATCH -sOutputFile=test.pdf.
Your test.eps uses a font named /SHZENL+Tahoma_00. Ghostscript will automatically embed this font, and it will be a subset by default (the prefix SHZENL may change in the PDF, though).
The page created by the command from your question is 612 x 792 pts (letter size).
The page created by the command given in my answer is 360 x 216 pts.
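Since -dEPSCrop crops to the bounding box, the resulting page size can be predicted from the %%BoundingBox comment in the EPS header. The following is a rough sketch of that check; the helper name eps_bbox_size is invented, it assumes integer coordinates, and the demo uses a stand-in header whose values were chosen to match the 360 x 216 pts reported above (it is not the asker's test.eps).

```shell
# Hypothetical helper: print "width height" (in points) from the first
# usable %%BoundingBox comment of an EPS file. Lines that only carry the
# "(atend)" placeholder are skipped.
eps_bbox_size() {
    awk '/^%%BoundingBox:/ && $2 != "(atend)" {
             print $4 - $2, $5 - $3
             exit
         }' "$1"
}

# Demo with a stand-in EPS header:
tmp=$(mktemp)
printf '%%!PS-Adobe-3.0 EPSF-3.0\n%%%%BoundingBox: 0 0 360 216\n' > "$tmp"
eps_bbox_size "$tmp"
rm -f "$tmp"
```

The demo prints 360 216. Running eps_bbox_size on a real EPS file before converting shows what page size to expect from -dEPSCrop.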

Ghostscript loses font while extracting the page from PDF

I split a PDF into pages with the following command line:
for G in $(seq 1 $(pdfinfo 47.pdf | sed -n 's/Pages:[^0-9]*\([0-9]*\).*/\1/p')) ; do
gs \
-dSAFER \
-sDEVICE=pdfwrite \
-dBATCH \
-dNOPAUSE \
-dFirstPage=$G \
-dLastPage=$G \
-o $G.pdf \
47.pdf ;
done
But some pages come out without text (the graphics are still present).
So I tried to extract the embedded fonts from the PDF:
gs -q -dNODISPLAY extractFonts.ps -c "(47.pdf) extractFonts quit"
I installed these fonts in the system Fonts folder.
After that, I repeated the splitting, but nothing changed.
I now have no idea how to make sure the pages are extracted correctly.
Ghostscript and pdfwrite are not actually intended for splitting PDF files up. There are other tools which will probably work better; why not try pdftk?
If you really want to use Ghostscript, then I would advise you to get hold of the latest bleeding-edge code from the Git repository. In that code the pdfwrite device will accept an output file name containing a '%d' and will write one file per page.
Beyond that, it seems most likely to me that you are simply experiencing a bug rather than 'losing the font'. If the font were missing, the text would still be there, but in a different font. Which version of GS are you using?
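A minimal sketch of the '%d' approach, assuming a Ghostscript build whose pdfwrite device accepts a page-number pattern in the output name. The helper name split_pdf_cmd and the default prefix are invented for illustration; the function only prints the command so it can be inspected before it is run, and it assumes file names without spaces.

```shell
# Hypothetical helper: print a Ghostscript command that writes one PDF
# per page (page_001.pdf, page_002.pdf, ...).
# Assumption: the installed pdfwrite device accepts %d in the output name.
split_pdf_cmd() {
    input=$1
    prefix=${2:-page}
    # %% in the printf format yields the literal % that Ghostscript needs
    printf 'gs -o %s_%%03d.pdf -sDEVICE=pdfwrite %s\n' "$prefix" "$input"
}

split_pdf_cmd 47.pdf
```

This prints gs -o page_%03d.pdf -sDEVICE=pdfwrite 47.pdf. Compared with the loop in the question, it needs a single Ghostscript run and no pdfinfo call to count the pages.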

Remove PDF margins while using PDFFitPage in GhostScript

I'm fitting a file with no margins (produced by running pdfcrop on a normal PDF file) to a given paper size using Ghostscript:
gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dFIXEDMEDIA \
-dPDFFitPage -dBATCH -dQUIET -dNOPAUSE -dDEVICEWIDTHPOINTS=864 \
-dDEVICEHEIGHTPOINTS=612 -sOutputFile=$OUTPUT $INPUT
but the output has additional margins (I was cropping in order to get rid of them).
Is it possible to force GhostScript to produce output without these margins?
Without seeing your file I cannot be certain, however I suspect that all you have done is set a /CropBox in the PDF file. By default Ghostscript uses the /MediaBox which is probably unchanged.
Try setting -dUseCropBox
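As a sketch of how that suggestion might be combined with the switches from the question: the helper below prints a fit-to-page command with -dUseCropBox added. The name fit_page_cmd and the file names are hypothetical, and whether this removes the margins depends on the input really carrying a /CropBox, as suspected above.

```shell
# Hypothetical helper: print a Ghostscript command that scales a page to
# fixed media while using the /CropBox instead of the /MediaBox.
# Width and height are given in points; file names must not contain spaces.
fit_page_cmd() {
    src=$1 dst=$2 width=$3 height=$4
    printf 'gs -o %s -sDEVICE=pdfwrite -dUseCropBox -dPDFFitPage -dFIXEDMEDIA -dDEVICEWIDTHPOINTS=%s -dDEVICEHEIGHTPOINTS=%s %s\n' \
        "$dst" "$width" "$height" "$src"
}

# Media size from the question: 864 x 612 pt
fit_page_cmd cropped.pdf fitted.pdf 864 612
```

The -o switch replaces the -dBATCH -dNOPAUSE -sOutputFile=... combination used in the question.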

GhostScript alternative

I'm currently using CentOS 5.6 (Ghostscript 8, ImageMagick 6.2.8)
and I'm trying to convert the first image of the PDF to a JPG file.
I understand that my current setup is unable to convert compressed PDF files. Is there an alternative with the same functionality that can?
The 'understanding' that Ghostscript is unable to convert 'compressed PDF' is wrong. Where did you pick it up?
PDF by default uses compression internally for most of its objects. It's rather unusual to find a PDF 'in the wild' which is completely uncompressed.
Which exact version of Ghostscript are you using? (Try gs -v).
BTW, you do not need ImageMagick to convert (multipage) PDF to a series of JPEGs. Try this command:
gs \
-o img_%03d.jpeg \
-sDEVICE=jpeg \
input.pdf
or, for a resolution of 300 dpi (instead of the default 72 dpi):
gs \
-o img_%03d.jpeg \
-sDEVICE=jpeg \
-r300 \
input.pdf
The _%03d-part of the output filename will attach a 3-digit number to the img-name that increments with each PDF page.
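To illustrate what the -r value and the %03d pattern mean in practice, here is a small shell sketch. The helper name points_to_pixels is invented; it relies only on the fact that one point is 1/72 inch, uses integer arithmetic (so fractional pixels are truncated), and does not call Ghostscript.

```shell
# Hypothetical helper: pixel size of a page dimension given in points
# (1 pt = 1/72 inch) when rendered at a given resolution in dpi.
points_to_pixels() {
    echo $(( $1 * $2 / 72 ))
}

# A letter page (612 x 792 pt) rendered at 300 dpi:
echo "$(points_to_pixels 612 300) x $(points_to_pixels 792 300)"

# How the %03d pattern expands for pages 1 to 3:
printf 'img_%03d.jpeg\n' 1 2 3
```

The first line of output is 2550 x 3300, followed by img_001.jpeg, img_002.jpeg and img_003.jpeg on separate lines.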