Print multiple PDF page ranges - pdf

I have a PDF with 200+ editable pages and need to "hardcode" them by printing to PDF as several smaller PDF files (i.e. pages 1-2, 3-8, 9, 10-11, 12-14, etc.).
Is there a way to automate this, since I do this exercise every month? Right now I have to print each subsection manually, one at a time.

You can use Ghostscript to copy a range of pages from a PDF file to another.
For example, to write pages 3-8 from input.pdf to output.pdf you could run the following from the command prompt, using the command line options to specify the first and last page to process.
gswin64c.exe -sDEVICE=pdfwrite -dFirstPage=3 -dLastPage=8 -o output.pdf input.pdf
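Since the same split is repeated every month, the command above can be wrapped in a loop over the page ranges. The following is a minimal sketch rather than a tested production script: it assumes a POSIX shell (for example Git Bash or WSL on Windows), and the function name split_ranges and the pages_FIRST-LAST.pdf output names are made up for illustration. It only prints the commands; remove the echo to actually run them, and replace gs with gswin64c.exe on Windows.

```shell
#!/bin/sh
# Print one Ghostscript command per page range (dry run).
# Each range is either "FIRST-LAST" or a single page number.
split_ranges() {
  input=$1
  shift
  for range in "$@"; do
    first=${range%-*}   # part before the dash (whole value if there is no dash)
    last=${range#*-}    # part after the dash (whole value if there is no dash)
    # Remove "echo" to run the command instead of printing it.
    echo gs -sDEVICE=pdfwrite -dFirstPage="$first" -dLastPage="$last" \
      -o "pages_${first}-${last}.pdf" "$input"
  done
}

split_ranges input.pdf 1-2 3-8 9 10-11 12-14
```

Keeping the ranges as plain arguments means next month's run only needs a different argument list, not a modified script.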

Related

Use Ghostscript to convert each page of a PDF to images and the output is still PDF

I know that Adobe Acrobat Reader DC can select the Microsoft Print to PDF printer to output to a PDF file with Print As Image checked in the Advanced Print Setup dialog. However, I want to do this from the command line. I tried the following command, but it did not convert each page to an image (note that the output file is still a PDF).
gs -o 0.999.watermask.compact.screen.pdf -sDEVICE=pdfwrite -dDetectDuplicateImages=true -dPDFSETTINGS=/screen 0.999.watermask.pdf
Your switches include -dDetectDuplicateImages=true, which should be superfluous under the circumstances, and the device can be any one of four, as pointed out by KenS.
gs -o 0.999.watermask.compact.screen.pdf -sDEVICE=pdfimage32 -dPDFSETTINGS=/screen 0.999.watermask.pdf
If you want to emulate MS Print As Image PDF on Windows, you would find the result in some ways inferior (and often many times bigger). For comparison, the command would be:
NOTE: "%%printer%%..." is the form for a batch file; on the command line use "%printer%...".
gswin64c.exe -sDEVICE=mswinpr2 -dNoCancel -o "%%printer%%Microsoft Print to PDF" -dPDFSETTINGS=/screen -f "0.999.watermask.pdf"

Ghostscript: ERROR: A pdfmark destination page x points beyond the last page y when use %003d

This command:
gs -sOutputFile=/destination/%003d.pdf \
-sDEVICE=pdfwrite \
-dBATCH \
-dNOPAUSE \
initial.pdf
gives me the error:
GPL Ghostscript 9.26: ERROR: A pdfmark destination page 10 points beyond the last page 1.
for each page, but the same command without the 003 doesn't return any error:
gs -sOutputFile=/destination/%d.pdf \
-sDEVICE=pdfwrite \
-dBATCH \
-dNOPAUSE \
initial.pdf
You are producing each page of the input PDF file as a separate PDF file when you specify %d. Thus each destination output file only has one page.
Your input file has 'something' (could be an outline, a Link, a Dest or possibly something else) which points to page 10 in the original file. Ghostscript's PDF interpreter converts that to a pdfmark and emits it.
Now in both cases you should be emitting one file per page, so I would expect both command lines to give you an error because page 10 is, clearly, outside the range of pages in any file.
It's hard to see why %d instead of %003d doesn't give an error; I would expect that it should. However, without the original PDF file to experiment with I can't tell what is going on. Your best bet, if you think this is a bug, is to open a bug report at https://bugs.ghostscript.com
You should also try the current version (9.50); the one you are using is somewhat out of date.
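As background for the two commands in the question: the %d in -sOutputFile is a printf-style placeholder that Ghostscript replaces with the page number, so each page goes to its own file. The sketch below only imitates that naming with the shell's printf (it does not call Ghostscript) and uses the common zero-padded form %03d; the helper name page_file is hypothetical.

```shell
#!/bin/sh
# Imitate how a printf-style pattern maps page numbers to file names.
# Usage: page_file PATTERN PAGE, e.g. page_file '%03d.pdf' 7
page_file() {
  # The pattern is deliberately used as the printf format string.
  printf "$1\n" "$2"
}

for page in 1 2 10; do
  page_file '%d.pdf' "$page"     # unpadded page number
  page_file '%03d.pdf' "$page"   # page number zero-padded to three digits
done
```

Because every output file then holds a single page, a link or outline entry that targets page 10 has nothing to point at inside that file, which is the situation the pdfmark error describes.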

ghostscript - remove only specific text in PDF file

Ghostscript can generate a new PDF without text from a source PDF with this simple command:
gs -o output_no_text.pdf -sDEVICE=pdfwrite -dFILTERTEXT input.pdf
My goal is to delete just one specific, fixed string from the source PDF, not all of the text. Is there a parameter I can set to do so?

Reverse white and black colors in a PDF

Given a black and white PDF, how do I reverse the colors such that background is black and everything else is white?
Adobe Reader does it (Preferences -> Accessibility) for viewing purposes only in the program. But does not change the document inherently such that the colors are reversed also in other PDF readers.
How to reverse colors permanently?
You can run the following Ghostscript command:
gs -o inverted.pdf \
-sDEVICE=pdfwrite \
-c "{1 exch sub}{1 exch sub}{1 exch sub}{1 exch sub} setcolortransfer" \
-f input.pdf
Acrobat will show the colors inverted.
The four identical parts {1 exch sub} are meant for CMYK color spaces and are applied to C(yan), M(agenta), Y(ellow) and (blac)K color channels in the order of appearance.
You may use only three of them -- then it is meant for RGB color spaces and is applied to R(ed), G(reen) and B(lue).
Of course you can "invent" your own transfer functions too, instead of the simple 1 exch sub one: for example, {0.5 mul} will just use 50% of the original color values for each color channel.
Note: The Ghostscript command above will show ALL colors inverted, not just black and white!
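To make the arithmetic of these transfer functions concrete, here is a small sketch that mirrors them outside of PostScript. It assumes each channel value lies between 0 and 1; the helper names invert and halve are invented for this illustration and are not Ghostscript features.

```shell
#!/bin/sh
# Mirror two PostScript transfer functions on a channel value in [0, 1].
invert() { awk -v x="$1" 'BEGIN { printf "%.2f\n", 1 - x }'; }    # same as {1 exch sub}
halve()  { awk -v x="$1" 'BEGIN { printf "%.2f\n", 0.5 * x }'; }  # same as {0.5 mul}

for value in 0 0.2 1; do
  echo "$value -> inverted $(invert "$value"), halved $(halve "$value")"
done
```

In {1 exch sub} the stack holds the channel value x, then 1 is pushed, exch swaps the two, and sub computes 1 - x, so 0 maps to 1 and 1 maps to 0.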
Caveats:
Some PDF viewers won't display the inverted colors, notably Preview.app on Mac OS X, Evince, MuPDF and PDF.js (Firefox PDF Viewer). Chrome's native PDF viewer (PDFium) will, as will Ghostscript and Adobe Reader.
It will not work with all PDFs (or for all pages of the PDF), because it is also dependent on how exactly the document's colors are defined.
Update
The command above was updated with the added -f parameter (required) before input.pdf. Sorry for not noticing this flaw in my command line before. I only became aware of it again because some good soul gave it its first upvote today...
Additional update: The most recent versions of Ghostscript do not require the added -f parameter any more. Verified with v9.26 (may also be true even with v9.25 or earlier versions).
The best method would be to use pdf2ps (the Ghostscript PDF to PostScript translator), which converts the PDF to a PS file.
Once the PS file is created, open it with any text editor and add {1 exch sub} settransfer before the first line.
Now convert the PS file back to PDF with the same software used above.
If you have the Adobe PDF printer installed, you go to Print -> Adobe PDF -> Advanced... -> Output area and select the "Invert" checkbox. Your printed PDF file will then be inverted permanently.
None of the previously posted solutions worked for me so I wrote this simple bash script. It depends on pdftk and awk. Just copy the code into a file and make it executable. Then run it like:
$ /path/to/this_script.sh /path/to/mypdf.pdf
The script:
#!/bin/bash
pdftk "$1" output - uncompress | \
awk '
  /^1 1 1 / {
    sub(/1 1 1 /,"0 0 0 ",$0);
    print;
    next;
  }
  /^0 0 0 / {
    sub(/0 0 0 /,"1 1 1 ",$0);
    print;
    next;
  }
  { print }' | \
pdftk - output "${1/%.pdf/_inverted.pdf}" compress
This script works for me but your mileage may vary. In particular sometimes the colors are listed in the form 1.000 1.000 1.000 instead of 1 1 1. The script can easily be modified as needed. If desired, additional color conversions could be added as well.
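As one example of such a modification, the awk stage could accept both the 1 1 1 and the 1.000 1.000 1.000 spellings. This is only a sketch under the same assumption as the script above, namely that the color operands sit at the start of a line in the uncompressed stream; the function name swap_black_white is made up, and the demo input is sample text, not a real PDF.

```shell
#!/bin/sh
# Swap pure white and pure black RGB operands at the start of a line,
# accepting both "1 1 1" and "1.000 1.000 1.000" style numbers.
swap_black_white() {
  awk '
    /^1(\.0+)? 1(\.0+)? 1(\.0+)? / { sub(/^1(\.0+)? 1(\.0+)? 1(\.0+)? /, "0 0 0 "); print; next }
    /^0(\.0+)? 0(\.0+)? 0(\.0+)? / { sub(/^0(\.0+)? 0(\.0+)? 0(\.0+)? /, "1 1 1 "); print; next }
    { print }'
}

# Demo on sample content-stream lines; the gray line is left unchanged.
printf '%s\n' '1.000 1.000 1.000 rg' '0 0 0 RG' '0.5 0.5 0.5 rg' | swap_black_white
```

In the script above, this function would take the place of the awk stage between the two pdftk calls.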
For me, the pdf2ps -> edit -> ps2pdf solution did not work. The intermediate .ps file is inverted correctly, but the final .pdf is the same as the original. The final .pdf in the suggested gs solution was also the same as the original.
For a cross-platform option, try MuPDF:
mutool draw -I -o out.pdf in.pdf [range of pages]
It should permanently change the colours in many viewers.
Later Edit
A sample file that did not reverse was one with linework only (no images). The method needed there was to save the graphics as an inverted image and then reuse that image to build a replacement PDF. Beware, however, that converting whole pages to images turns any searchable text into unsearchable pixels, so the rebuild would need to be run with OCR active.
The two commands needed will be something like the following (%4d means the image numbering starts at output0001):
mutool draw -o output%4d.png -I input.pdf
For Linux users the following second pass should work easily:
mutool convert -O compress -o output.pdf output*.png
Windows users will, for now (v1.19), need to combine the files by scripting or use groups:
mutool convert -O compress -o output.pdf output0001.png output0002.png output0003.png
The next version may include a #filelist option; see https://bugs.ghostscript.com/show_bug.cgi?id=703163
This is probably just a frontend for the ghostscript command Kurt Pfeifle posted, but you could also use imagemagick with something like:
convert -density 300 -colorspace RGB -channel RGB -negate input.pdf output.pdf

When converting first page of a PDF into an image using Ghostscript, sometimes I get "extra" space. Why?

I am building a simple script which converts the first page of a PDF into an image using Ghostscript. Here is the command I use:
gs -q -o output.png -sDEVICE=pngalpha -dLastPage=1 input.pdf
This works beautifully with some PDFs, e.g. if I convert the first page of a PDF that looks like this:
I actually get this first page as an image and there aren't any problems.
But I have noticed that with some first pages of other PDFs, like the following:
With the same gs command, after the conversion, the .png image looks like this:
The problem is that I get this extra white space on the left inside the image when I convert that page. Why does Ghostscript do this? Where does that extra blank white space come from?
Most likely, your PDFs do not use identical values for /MediaBox and for /CropBox. For details about these technical terms related to a page, see this illustration from the German Wikipedia:
In other words: the /CropBox values (if given) for a PDF page determine which (smaller) part of the overall page information (which is inside the /MediaBox) the PDF viewer should make visible to the user (or to the printer).
Solution
To see the box values for all the pages of your book(s), run this command:
pdfinfo -f 1 -l 1000 -box my.pdf
To see these values just for the first page, run
pdfinfo -l 1 -box my.pdf
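To check quickly whether the two boxes differ, the pdfinfo output can be piped through a small filter. This is a sketch that assumes the output contains lines with the labels MediaBox: and CropBox:, each followed by four numbers; the function name boxes_differ and the sample values in the demo are invented for illustration.

```shell
#!/bin/sh
# Report whether the last MediaBox and CropBox lines on stdin carry the same four numbers.
boxes_differ() {
  awk '
    /MediaBox:/ { media = $(NF-3) " " $(NF-2) " " $(NF-1) " " $NF }
    /CropBox:/  { crop  = $(NF-3) " " $(NF-2) " " $(NF-1) " " $NF }
    END { if (media != crop) print "differ"; else print "same" }'
}

# Real use (needs pdfinfo and a PDF):  pdfinfo -l 1 -box my.pdf | boxes_differ
# Demo with made-up sample lines:
printf 'MediaBox: 0.00 0.00 612.00 792.00\nCropBox: 50.00 0.00 612.00 792.00\n' | boxes_differ
```

If the result is "differ" for the first page, the extra margin in the PNG most likely comes from rendering the full /MediaBox.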
For Ghostscript to give the results you want, add -dUseCropBox to your command line:
gs -q -o output.png -sDEVICE=pngalpha -dLastPage=1 -dUseCropBox input.pdf