ghostscript not retaining page level parameter while merging two postscripts - pdf

I converted a PDF file to PostScript using Ghostscript. During the conversion I passed a page-level parameter for the duplex option, as below.
gswin32c.exe -q -dSAFER -dNOPAUSE -dBATCH -sDEVICE=output.ps \
-c "<</PSPageOptions [ (<</Duplex false>> setpagedevice)
(<</Duplex true>> setpagedevice) (<</Duplex true>> setpagedevice) ]
/LockDistillerParams true>> setdistillerparams" -f input.pdf
The command above comes from this answer: https://stackoverflow.com/a/64128881/13696415
I added the duplex parameter for 2 PDF files and converted them to 2 individual PostScript files. The problem is that when I merge these PostScript files with Ghostscript, the page-level parameters I passed while converting to PS are lost. I tried the answer below to merge the PostScript files.
https://stackoverflow.com/a/3445325/13696415
Why are the added parameters lost when merging? How can I retain page-level parameters while merging? Any help would be appreciated.

I can confirm that the %%BeginPageSetup entries for setpagedevice are lost when merging 2 PostScript files. Even /LockDistillerParams fails to preserve the settings. Simply running the PostScript files through the Ghostscript ps2write device again causes the output to drop the previous settings. I suspect Ghostscript regenerates these entries on every run, so they disappear whenever /PSPageOptions is not supplied again. I don't know of a way to keep the settings when merging.
I have tried two other techniques with good results.
(1) Merge the 2 postscript files and then use the ps2write device to write the desired settings to the combined postscript file.
gs -dBATCH -dNOPAUSE -sDEVICE=ps2write -sOutputFile=merged.ps -f file1.pdf file2.pdf
gs -dBATCH -dNOPAUSE -sDEVICE=ps2write -sOutputFile=merged-out.ps -c ' << /PSPageOptions [ (<</Duplex false>> setpagedevice) (<</Duplex true>> setpagedevice) (<</Duplex true>> setpagedevice) ] /LockDistillerParams true >> setdistillerparams ' -f merged.ps
(2) Use Ghostscript to merge the 2 PDF files with the ps2write device and include the /PSPageOptions setdistillerparams call, for an all-in-one operation. I have found this only works for certain PDF files. For example, it doesn't work if the PDF files were generated with the cairographics library used by my Firefox, even if they are redistilled with Ghostscript.
My test used two well-behaved 12-page PDF files. The results show the % page3 string at page 13, as desired. The strings can be changed to call setpagedevice as needed:
gs -dBATCH -dNOPAUSE -sDEVICE=ps2write -sOutputFile=file1+2.ps -c '<< /PSPageOptions [(% page1)(% page2)(% page3)(% page4)(% page5)] /LockDistillerParams true >>setdistillerparams' -f file1.pdf file2.pdf
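As a rough illustration of how the /PSPageOptions strings line up with pages, here is a small Python sketch. It assumes the strings are applied in order and wrap around once the pages outnumber them, which is consistent with the "% page3 at page 13" result above. The helper names are made up for this example and are not part of Ghostscript.

```python
def pspageoptions_arg(page_strings):
    """Build the PostScript text passed to gs after -c.

    Assumes each string contains no unbalanced parentheses, since it is
    wrapped in a PostScript (...) string literal.
    """
    entries = " ".join("(%s)" % s for s in page_strings)
    return ("<< /PSPageOptions [ %s ] /LockDistillerParams true >> "
            "setdistillerparams" % entries)


def option_for_page(page_strings, page_number):
    """Return the string used for a 1-based page number, assuming wrap-around."""
    return page_strings[(page_number - 1) % len(page_strings)]


# The five marker strings from the test command above.
markers = ["% page1", "% page2", "% page3", "% page4", "% page5"]

# The duplex strings from the question.
duplex = ["<</Duplex false>> setpagedevice",
          "<</Duplex true>> setpagedevice",
          "<</Duplex true>> setpagedevice"]

print(option_for_page(markers, 13))
print(pspageoptions_arg(duplex))
```

With five strings and 24 pages, page 13 lands on the third entry, so the number of strings you pass decides where the pattern restarts in the merged output.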
P.S. Please edit your original post to show the correct -sDEVICE value. The trailing backslash can also be omitted, depending on the shell.

Related

Ghostscript: ERROR: A pdfmark destination page x points beyond the last page y when use %003d

This command:
gs -sOutputFile=/destination/%003d.pdf \
-sDEVICE=pdfwrite \
-dBATCH \
-dNOPAUSE \
initial.pdf
gives me the error:
GPL Ghostscript 9.26: ERROR: A pdfmark destination page 10 points beyond the last page 1.
for each page, but the same command without the 003 doesn't return any error:
gs -sOutputFile=/destination/%d.pdf \
-sDEVICE=pdfwrite \
-dBATCH \
-dNOPAUSE \
initial.pdf
You are producing each page of the input PDF file as a separate PDF file when you specify %d. Thus each destination output file only has one page.
Your input file has 'something' (could be an outline, a Link, a Dest or possibly something else) which points to page 10 in the original file. Ghostscript's PDF interpreter converts that to a pdfmark and emits it.
Now in both cases you should be emitting one file per page, so I would expect both command lines to give you an error because page 10 is, clearly, outside the range of pages in any file.
It's hard to see why %d instead of %003d doesn't give an error; I would expect that it should. However, without the original PDF file to experiment with I can't tell what is going on. Your best bet, if you think this is a bug, is to open a bug report at https://bugs.ghostscript.com
You should also try the current version (9.50); the one you are using is somewhat out of date.
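To make the "one file per page" point concrete, here is a small Python sketch of how a per-page template expands. It uses Python's % operator as a stand-in for Ghostscript's own expansion, and %03d as an assumed equivalent of the %003d in the question; the function name is hypothetical.

```python
def output_names(template, page_count):
    """Expand a printf-style template once per 1-based page number."""
    return [template % page for page in range(1, page_count + 1)]


# Each input page gets its own single-page output file.
plain = output_names("/destination/%d.pdf", 10)
padded = output_names("/destination/%03d.pdf", 10)

print(plain[-1])
print(padded[-1])
```

Either way, ten input pages produce ten output files of one page each, so a destination pointing at page 10 has nothing to land on inside any of them.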

Ghostscript to convert PDF 2 Back to PDF version 1.7

I need to build a PDF serverside by reading in a number of PDFs and inserting each page into a new multipage PDF. The problem is that the PDFs are provided in version 2.0 format, but my application can only read version 1.7. I would like to convert the version 2 files back into a version 1.7 file so that my application can read it.
I am using ghostscript version 9.27, and have tried several commands, but each time I end up with an empty PDF. Example:
/usr/local/bin/gs \
-q -dNOPROMPT \
-dBATCH \
-dDEVICEWIDTH=595 \
-dDEVICEHEIGHT=842 \
-sDEVICE=pdfwrite \
-dCompatibilityLevel=1.7 \
-sFileName=pdf-version-2.pdf \
-sOutputFile=fileout.pdf
There is no error, just an empty PDF. The "file" command does give the expected output "PDF document, version 1.7" but that's not much good when the file is blank. Any help greatly appreciated!
OK so I think the problem is your command line (as pointed out to me by one of my colleagues). You've specified -sFileName=pdf-version-2.pdf, which looks like you're trying to specify the input file.
There is no Ghostscript switch -sFileName; you specify the input filename(s) by simply putting the name on the command line. So you really want:
/usr/local/bin/gs \
-q -dNOPROMPT \
-dBATCH \
-dDEVICEWIDTH=595 \
-dDEVICEHEIGHT=842 \
-sDEVICE=pdfwrite \
-dCompatibilityLevel=1.7 \
-sOutputFile=fileout.pdf \
pdf-version-2.pdf
For reasons that were good 30 years ago, the Ghostscript command line switches are copied into the PostScript environment, where they can be accessed by PostScript programs. So while it's true (and possibly the source of your confusion) that some of the utility programs shipped with Ghostscript do use -sFileName, Ghostscript itself doesn't. It just defines a PostScript variable using that name, so that programs can read it.
Because you've specified BATCH and NOPROMPT, but haven't specified an input file, the interpreter starts up, erases the current page to white, then exits. Closing the pdfwrite device causes it to write out the current content of the page, which is, well, white, resulting in your empty PDF file.
The slightly modified command line above worked well for me, but as I noted in my comments specifying DEVICEWIDTHPOINTS and DEVICEHEIGHTPOINTS won't actually do anything.
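If you are calling Ghostscript from a server-side script, a minimal Python sketch of building that argument list could look like the following. The function name and defaults are made up for illustration, and the page size switches are left out as a simplification. The point is that the input file goes last as a plain positional argument and no -sFileName switch is involved.

```python
def gs_downconvert_args(input_pdf, output_pdf, level="1.7", gs="gs"):
    """Build an argument list for rewriting a PDF with pdfwrite.

    The input file is a bare positional argument at the end; it is not
    passed through any -s switch.
    """
    return [
        gs,
        "-q", "-dNOPROMPT", "-dBATCH",
        "-sDEVICE=pdfwrite",
        "-dCompatibilityLevel=%s" % level,
        "-sOutputFile=%s" % output_pdf,
        input_pdf,
    ]


args = gs_downconvert_args("pdf-version-2.pdf", "fileout.pdf",
                           gs="/usr/local/bin/gs")
print(" ".join(args))

# To actually run it (requires Ghostscript to be installed):
#   import subprocess
#   subprocess.run(args, check=True)
```

Passing a list rather than one shell string avoids quoting problems when file names contain spaces.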

ghostscript - remove only specific text in PDF file

Ghostscript can generate a new PDF without text from a source PDF with this simple command:
gs -o output_no_text.pdf -sDEVICE=pdfwrite -dFILTERTEXT input.pdf
My goal is to delete just one specific, fixed string from the PDF, not all of the text. Is there a parameter I can set to do that?

When converting first page of a PDF into an image using Ghostscript, sometimes I get "extra" space. Why?

I am building a simple script which converts the first page of a PDF into an image using Ghostscript. Here is the command I use:
gs -q -o output.png -sDEVICE=pngalpha -dLastPage=1 input.pdf
This works beautifully with some PDFs, e.g. if I convert the first page of a PDF that looks like this:
I actually get this first page as an image and there aren't any problems.
But I have noticed that with some first pages of other PDFs, like the following:
With the same gs command, after the conversion, the .png image looks like this:
The problem is that I get this extra white space on the left inside the image when I convert that page. Why does Ghostscript do this? Where does that extra blank white space come from?
Most likely, your PDFs do not use identical values for /MediaBox and for /CropBox. For details about these technical terms related to a page, see this illustration from the German Wikipedia:
In other words: the /CropBox values (if given) for a PDF page determine which (smaller) part of the overall page information (which is inside the /MediaBox) the PDF viewer should make visible to the user (or send to the printer).
Solution
To see the box values for all the pages of your book(s), run this command:
pdfinfo -f 1 -l 1000 -box my.pdf
To see these values just for the first page, run
pdfinfo -l 1 -box my.pdf
For Ghostscript to give the results you want, add -dUseCropBox to your command line:
gs -q -o output.png -sDEVICE=pngalpha -dLastPage=1 -dUseCropBox input.pdf
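To see how a box mismatch turns into extra space, here is a small Python sketch that compares the two boxes once you have read them off. The box values are hypothetical, chosen only to mimic the "extra space on the left" case, and the function name is made up for this example.

```python
def hidden_margins(media_box, crop_box):
    """Return how many points of the MediaBox the CropBox hides on each side.

    Boxes are (llx, lly, urx, ury) tuples in PDF points.
    """
    m_llx, m_lly, m_urx, m_ury = media_box
    c_llx, c_lly, c_urx, c_ury = crop_box
    return {
        "left": c_llx - m_llx,
        "bottom": c_lly - m_lly,
        "right": m_urx - c_urx,
        "top": m_ury - c_ury,
    }


# Hypothetical values: the CropBox starts 105 points in from the left edge.
media = (0.0, 0.0, 700.0, 842.0)
crop = (105.0, 0.0, 700.0, 842.0)

margins = hidden_margins(media, crop)
print(margins)
```

Any non-zero margin is area that a viewer hides but that ends up in the image when the whole MediaBox is rendered, which is what -dUseCropBox avoids.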

Split multi-page PDF into JPG (or PNG...) files using ImageMagick

I'm facing a little problem with ImageMagick, which I have found to be a marvellous tool so far, but here it doesn't do what I expect (N.B.: I work on Windows 7).
I read that, to split a 3-page PDF file (for example), you just have to do:
img2img My3pageFile.pdf SplittedImage.jpg
and then ImageMagick would automatically create SplittedImage-1.jpg, SplittedImage-2.jpg and SplittedImage-3.jpg.
Well, instead of this, I get an error message like the following (I hope you'll believe me when I say that the file "benef.pdf" does exist on D:).
D:\>img2img benef.pdf benef.jpg
img2img: `%s': %s "gswin32c.exe" -q -dQUIET -dSAFER -dPARANOIDSAFE -dBATCH
-dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dEPSCrop -dAlignToPixels=0 -dGridFitTT=0
"-sDEVICE=pnmraw" -dTextAlphaBits=4 -dGraphicsAlphaBits=4 "-g595x842" "-r72x72"
"-sOutputFile=C:/Users/ADM-A2~1/AppData/Local/Temp/magick-o3McMMZQ" "-fC:/Users
/ADM-A2~1/AppData/Local/Temp/magick-EOLT_ZO2" "-fC:/Users/ADM-A2~1/AppData/Local
/Temp/magick-mUWMMcc0".
img2img: Postscript delegate failed `benef.pdf'.
img2img: missing an image filename `benef.jpg'.
The answer is simply to download and install Ghostscript from the address below; after that, the command I gave at the beginning works perfectly well.
Here's the link:
http://downloads.ghostscript.com/public/
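Before running the conversion, a quick way to check whether a Ghostscript executable can be found on the PATH is something like this Python sketch. The list of candidate names is an assumption (gswin32c is the one shown in the error message above), and the function name is made up.

```python
import shutil


def find_ghostscript(candidates=("gswin64c", "gswin32c", "gs")):
    """Return the full path of the first candidate found on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


gs_path = find_ghostscript()
if gs_path is None:
    print("Ghostscript not found on PATH; the PDF delegate will likely fail.")
else:
    print("Found Ghostscript at", gs_path)
```

If this reports nothing found, the "Postscript delegate failed" message is the expected result, since the PDF has to be rendered by Ghostscript first.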