ghostscript converts my pdf to cmyk but not really - pdf

I run this command and the conversion is successful
gs \
-o converted-to-cmky.pdf \
-sDEVICE=pdfwrite \
-sProcessColorModel=DeviceCMYK \
-sColorConversionStrategy=CMYK \
-sColorConversionStrategyForImages=CMYK \
./original-srgb.pdf
But then I run
identify -verbose ./converted-to-cmky.pdf
And it still reports 'Colorspace: sRGB'. Any idea what could be causing this?
Thank you.

ImageMagick's identify is basically lying. It has no idea about the individual objects in the PDF file. A PDF can easily contain both CMYK and RGB objects, not to mention spot colour plates etc., mixed all together.
Use either the commercial Adobe Acrobat Pro preflight check to see the actual situation, or open the resulting PDF in Scribus, go to Edit > Colours and remove unused colours, to see whether all colours in the PDF actually are CMYK or not. They probably will be, even though identify thinks the PDF is RGB.
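As a lighter-weight check, Ghostscript's inkcov device reports per-page CMYK ink coverage. It renders the file through a CMYK pipeline rather than inspecting objects, so it is a rough sanity check, not a preflight; a page whose C, M and Y values are all 0 is at least behaving as pure black. A sketch, using the filename from the question and assuming a reasonably recent Ghostscript:

```shell
# Prints one line per page with fractional C M Y K coverage.
gs -o - -sDEVICE=inkcov ./converted-to-cmky.pdf
```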

Related

How can I use Ghostscript to pre-process pdfs for older Kindles?

I have an old Kindle Dx. Owing to disabilities, I can't use tablets or other touch devices, and I transfer pdfs to the Kindle to read them. It requires pre-processing.
What is a good option to pre-process pdfs without rasterizing them?
[When rasterizing is acceptable:
k2pdfopt -mode copy for maps or for small text. This rasterizes, enhances contrast, and makes everything 1.4-compatible.
k2pdfopt -mode copy -dev dx for other works. This rasterizes to 800x1080, downsamples as needed, enhances contrast while making everything grayscale, and makes everything 1.4-compatible.
When rasterizing text is not acceptable:
gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -sstdout=%stderr -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf input.pdf if you want to preserve graphics. This makes minimal changes to make everything 1.4-compatible.
gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 \
-g800x1080 -r150 -dPDFFitPage \
-dFastWebView -sColorConversionStrategy=RGB \
-dDownsampleColorImages=true -dDownsampleGrayImages=true -dDownsampleMonoImages=true -dColorImageResolution=150 -dGrayImageResolution=150 -dMonoImageResolution=300 -dColorImageDownsampleThreshold=1.0 -dGrayImageDownsampleThreshold=1.0 -dMonoImageDownsampleThreshold=1.0 \
-sstdout=%stderr -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf input.pdf if you want moderate downsampling. This re-rasterizes existing raster images to fit 800x1080 and makes everything 1.4-compatible.
gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 \
-g800x1080 -r150 -dPDFFitPage \
-dFastWebView -sColorConversionStrategy=Gray \
-dDownsampleColorImages=true -dDownsampleGrayImages=true -dDownsampleMonoImages=true -dColorImageResolution=75 -dGrayImageResolution=75 -dMonoImageResolution=150 -dColorImageDownsampleThreshold=1.0 -dGrayImageDownsampleThreshold=1.0 -dMonoImageDownsampleThreshold=1.0 \
-sstdout=%stderr -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf input.pdf if you want more aggressive downsampling. This re-rasterizes raster images to fit 400x540, makes them grayscale, and makes everything 1.4-compatible. Low image quality, but usually still recognizable.
gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dFILTERIMAGE -dFILTERVECTOR -sstdout=%stderr -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf input.pdf if you want to cut all graphics.
If using any of these options to pre-process for another device check its screen size in pixels. Don't worry too much about pixels per inch.]
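For batch work, the minimal-changes command above can be wrapped in a small loop; the kindle- output prefix is just a hypothetical naming convention:

```shell
# Re-emit every PDF in the current directory as PDF 1.4 (sketch).
for f in ./*.pdf; do
  gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -sstdout=%stderr \
     -dNOPAUSE -dQUIET -dBATCH -sOutputFile="kindle-${f##*/}" "$f"
done
```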
[I.S. My goals are to fix pdfs so they 1. don't crash my Kindle, 2. don't freeze my Kindle or take too long to load each page, and 3. don't take up too much of the limited disk space on my Kindle. Preferably also 4. not rasterizing text, 5. not cutting out all images, which can sometimes lose tables, etc., and 6. not reflowing text, which will generally lose tables. But I'm happy to downsample most images.]
[I.S. Note that I'm keeping copies of the originals. This is not a way to save disk space!]
For scanned pdfs, Willus's k2pdfopt is a great option. I've set up Mac Automator for
k2pdfopt -mode copy -dev dx
or occasionally just -mode copy.
For pdf-born-pdfs, I'd rather not rasterize everything.
gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -sstdout=%stderr
-dNOPAUSE -dQUIET -dBATCH
can usually convert files so that the Kindle DX can open them, but the Kindle will still slow down, freeze, or crash on some pages.
One option is to combine Ghostscript and Mutool as follows:
gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -sstdout=%stderr -dNOPAUSE -dQUIET -dBATCH to pre-process pdfs to remove passwords,
mutool clean -g -g -d -s -l to sort out the junk, and then
gs
-sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -sstdout=%stderr -dNOPAUSE -dQUIET -dBATCH again to get a smaller and faster pdf.
Note: I think Mutool's 3rd -g is the equivalent of Ghostscript's -dDetectDuplicateImages. Since it slows rendering down it may be better to do the opposite. I'm not sure how to set it to false. -dDetectDuplicateImages false? -uDetectDuplicateImages?
Note: I'm using gtime to time pdf rendering.
A single-step tool in a single application would help, and an image-reduction tool would also help. Ghostscript's documentation is hard to follow.
For cleanup, as an alternative to running mutool:
-dFastWebView might help.
-dNOGC indicates that Ghostscript does garbage collection by default.
For image reduction:
-dPDFSETTINGS=/screen seems to work better in 9.50 than 9.23. /ebook might be better since it embeds all fonts.
-dFILTERIMAGE -dFILTERVECTOR also work better in 9.50 than 9.23, but are more drastic than I'd like.
A lot of settings seem to rely on input resolution and/or input page size.
-r seems to rely on input page size, rather than output page size. The Kindle DX is 800 pixels by 1080 pixels.
-dDownScaleFactor reduces relative to input resolution.
-g800x1080 seems to crop pages, not shrink them.
I think -sDEVICE=pdfimage8 rasterizes everything, like k2pdfopt.
In some cases
gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dFastWebView
-uDetectDuplicateImages -dPDFSETTINGS=/ebook -sstdout=%stderr -dNOPAUSE -dQUIET -dBATCH yields larger and slower files than just -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -sstdout=%stderr -dNOPAUSE -dQUIET -dBATCH
... I'm not sure what to make of these results.
You've asked an awful lot in here, which makes it rather difficult to read and answer cogently. You haven't really made it clear exactly what it is you want to achieve (you also haven't said what version of GS and MuPDF you are using).
Here are some points;
You don't need to 'clean out the junk' from PDF files produced by Ghostscript; these rarely have anything which can be removed, which is one reason people run PDF files through GS+pdfwrite (despite my saying constantly that it's a bad idea).
Using the -g switch with Mutool twice doesn't (AFAIK) do anything extra, but adding -d decompresses the files. You can have Ghostscript produce uncompressed PDF files too, use -dCompressPages=false -dCompressFonts=false -dCompressStreams=false.
When you pass your PDF through pdfwrite, then MuPDF, then pdfwrite again, you are risking quality degradation at every step, and the intermediate MuPDF step is unlikely to achieve anything. Most likely what you are doing is reducing the compression (and quality) of any JPEG-compressed images; I doubt much else of use is happening.
I can't think why you'd want to not detect duplicate images, it really just makes the file bigger, but if you want to, you use the switch the same way as all the other GS switches: -dDetectDuplicateImages=false. Note this won't change the processing speed (and generally pdfwrite doesn't do rendering, but perhaps you mean on the target device...); the detection is done by applying an MD5 filter to every image as it is read, then comparing the MD5 hashes. Switching that off doesn't stop the MD5, it just stops the comparison.
If you find Ghostscript's documentation hard to follow, then use the Adobe documentation for distillerparams, that's where the majority of the pdfwrite settings come from (ie blame Adobe for this ;-)
-dFastWebView is (IMO) totally pointless; it's there purely for compatibility with Adobe, and because a lot of people won't accept that it's useless and insist on it. All it does is speed up loading of the first page of a PDF file, by PDF consumers which support it (which is practically none). And to do this it makes the file slightly bigger and more complicated.
Do NOT use -dNOGC. I keep telling people not to do this; it's a debugging tool, and it has no practical value in production other than to potentially make Ghostscript use more memory. Everything else you hear about it is cargo cult.
-r has nothing to do with the media size at all, and does (more or less) nothing with pdfwrite. It sets the resolution of a page when rendering. Since you don't want to render to an image, setting the resolution is not a useful thing to do.
No pdfwrite settings rely on the "input resolution" because PDF (and PostScript) files don't have a resolution, they are vector page descriptions.
-dDownScaleFactor is a switch which only applies to the downscaling devices, tiffscaled and friends, which are rendering devices; it has no effect at all on pdfwrite.
Setting a fixed media size (using -g) does indeed rely on the resolution (because it's specified in device pixels) and does indeed only alter the media size, not the content. If you want to rescale the content to fit the new media, then you need to use -dFitPage. I can't really see why you would do that. Note that it doesn't affect the content of a PDF file (unless it's a rendered image), it just makes all the numeric values smaller.
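Putting those pieces together, a hedged sketch of fixing the media size and scaling content to it (the 800x1080 geometry is taken from the question; filenames are hypothetical):

```shell
# At 150 dpi, -g800x1080 defines 800/150 x 1080/150 inches of media;
# -dFitPage scales the page content to fit that media.
gs -o fitted.pdf -sDEVICE=pdfwrite \
   -r150 -g800x1080 -dFitPage input.pdf
```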
The pdfimage devices do indeed produce a PDF file where the entire content is an image; hence the name....
Now, if you could define what you actually want to achieve, I could make some suggestions.....
[EDIT]
image downsampling
Firstly there are three controls which turn this feature on/off altogether;
-dDownsampleMonoImages, -dDownsampleGrayImages and -dDownsampleColorImages. Assuming you don't select a PDFSETTINGS (I would recommend you do not) these are all initially false. If you want to downsample any images you need to set the relevant mono/gray/color switch to true.
Once downsampling is enabled then you need to set the relevant ImageResolution and DownsamplingThreshold, there are again switches for each colour depth.
Now, although PDF files don't have a resolution, the images have an effective resolution, but it's not easy to calculate (actually, without a lot of effort, it's impossible). It's the number of image samples in the bitmap in each direction, divided by the area of the media covered by the image.
As an example, if I have an image of 100x100 samples, and that is placed on the page in a 1 inch square, then the resolution of the image is 100 dpi. If I then scale the image up so that it covers a 2 inch square (but don't change the image data) then it's 50 dpi.
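The arithmetic from that example, as a quick sanity check (toy numbers, not a Ghostscript feature):

```shell
# effective dpi = samples across / inches covered on the page
awk 'BEGIN { printf "%d dpi at 1in, %d dpi at 2in\n", 100/1, 100/2 }'
```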
So you need to decide what resolution looks OK on your device. You then set -dColorImageResolution, -dMonoImageResolution and -dGrayImageResolution.
That's the 'target' resolution. But if the image is already close to that it can be wasteful to process it, so the Downsampling threshold is consulted. The actual resolution of the image in the input has to be the target resolution times the threshold, or more, to be reduced for output.
If we consider, for example, a target resolution of 300 and a threshold of 1.5 then the actual resolution of an image in the input file would have to exceed 450 dpi to be considered for downsampling.
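That cut-off is just the target resolution times the threshold:

```shell
# An image is considered for downsampling only if its effective dpi
# exceeds target * threshold.
awk -v target=300 -v threshold=1.5 \
    'BEGIN { printf "downsample above %g dpi\n", target * threshold }'
```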
Obviously you can set the threshold to 1.0 eg -dColorImageDownsampleThreshold=1.0
Finally there is the downsampling type; this is the filter used to create the lower resolution image from the higher. The simplest is /Subsample: basically, throw away enough lines and columns until we reach the required resolution (this is the only filter available for monochrome images, as all the others would change the colour depth). Then there's /Average, which averages the value in each direction, effectively a bilinear filter. Finally there's /Bicubic, which probably does the 'best' job but will be the slowest to process.
On top of all that you can choose the image filter (the compression filter) used to write the image data. We don't support JPXEncode in the AGPL version of Ghostscript and pdfwrite. That leaves you /CCITTFaxEncode (for monochrome), DCTEncode (JPEG) and FlateEncode (basically Zip compression). The switches are MonoImageFilter, GrayImageFilter and ColorImageFilter.
If you want to use these you must first set AutoFilterGrayImages to false and/or AutoFilterColorImages to false, because if these are true the pdfwrite device will choose a compression method by looking to see which one compresses most. For Gray and Color images this will almost certainly be JPEG.
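A sketch of forcing lossless Flate instead of the automatically chosen JPEG for colour images, using the switches described above (filenames hypothetical):

```shell
# Disable automatic filter selection, then pick FlateEncode explicitly.
gs -o lossless-images.pdf -sDEVICE=pdfwrite \
   -dAutoFilterColorImages=false -dColorImageFilter=/FlateEncode \
   input.pdf
```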
A final point is that linework (vector data) cannot be selectively rendered; either everything is rendered or everything is maintained 'as it was'. The only time (in general) that pdfwrite renders content is when transparency is present and the output CompatibilityLevel doesn't support transparency (1.3 or below). There are exceptions, but they are quite uncommon.
You might want to consider setting the ColorConversionStrategy to either /DeviceRGB or /DeviceGray. I've no idea if you are using colour or grayscale devices, but if they are grayscale creating a gray PDF file would reduce the size and processing significantly. Creating an RGB file for colour devices probably makes sense too, in case the input is CMYK.
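Assembled from the pieces above, a hedged starting point for a small grayscale file; the 150/300 dpi targets and the filenames are assumptions to tune for the device, not recommendations from the answer:

```shell
gs -o gray-small.pdf -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 \
   -sColorConversionStrategy=Gray \
   -dDownsampleColorImages=true -dColorImageResolution=150 \
   -dColorImageDownsampleThreshold=1.0 \
   -dDownsampleGrayImages=true -dGrayImageResolution=150 \
   -dGrayImageDownsampleThreshold=1.0 \
   -dDownsampleMonoImages=true -dMonoImageResolution=300 \
   -dMonoImageDownsampleThreshold=1.0 \
   input.pdf
```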

Ghostscript - PDF file, with multiple trays, and with a lot of problems

I don't speak good English, but I hope someone can help me on this one...
I spent several days on this but I can't figure out on my own. Here's the deal:
I have 4000+ PDF documents, with TrimBox margins, each one with 16 pages, in color.
I needed to batch print them:
Print pages 1-10 using the paper on tray 3;
Print pages 11-15 using the paper on tray 4, two copies uncollated.
Print page 16 using the paper on tray 3.
I'm using a Kyocera 7550ci; the PPD is here.
I have installed Ghostscript 9.19, and also gsview with gsprint, on Windows 7 SP1.
When I first tried to do anything at all, I didn't know Ghostscript or how to use it, but after some reading I managed to "kind of" solve the problem. I duplicated the printer in the Windows control panel, set each one up with the configuration I wanted, and ran the following commands with GSPRINT:
gsprint -printer "Kyocera TASKalfa 7550ci KX" -color -dUseTrimBox -dFitPage -from 1 -to 10 s_file0001.pdf
gsprint -printer "ALT Kyocera" -color -dUseTrimBox -dFitPage -from 11 -to 15 -copies 2 s_file0001.pdf
gsprint -printer "Kyocera TASKalfa 7550ci KX" -color -dUseTrimBox -dFitPage -from 16 -to 16 s_file0001.pdf
(I set the TASKalfa 7550ci default driver to use tray 3, and ALT Kyocera to use tray 4 and uncollate).
It worked, but was painfully slow both for Windows to process and for the printer to process. I soon realised GSPRINT is slow because it has to render the whole page to a bitmap, and started to see if I could use pure Ghostscript to do the work.
gswin32c -dBATCH -dNOPAUSE -q -dUseTrimBox -dFitPage -dFirstPage=1 -dLastPage=10 -sDEVICE=mswinpr2 -sOutputFile="%printer%Kyocera TASKalfa 7550ci KX" -f test.pdf
gswin32c -dBATCH -dNOPAUSE -q -dUseTrimBox -dFitPage -dFirstPage=11 -dLastPage=15 -sDEVICE=mswinpr2 -sOutputFile="%printer%ALT Kyocera" -f test.pdf
gswin32c -dBATCH -dNOPAUSE -q -dUseTrimBox -dFitPage -dFirstPage=16 -dLastPage=16 -sDEVICE=mswinpr2 -sOutputFile="%printer%Kyocera TASKalfa 7550ci KX" -f test.pdf
But I still have a lot of problems... I'm frustrated that I couldn't get it to work even after trying really hard to read the manuals and search around.
Using mswinpr2 is still really slow, gives me wrong colors, and I can't figure out how to select the paper tray.
Using any of the included PCL drivers was fast and managed to select the correct tray using -dMediaPosition, but there are only black-and-white drivers...
Using pdfwrite, the TrimBox is not correctly scaled to fit the whole page, and I can't select the correct tray.
Using ps2write, I can't select a tray and it messes up the page position.
I'm lost. Can someone give me some directions? Also, is there some way to send everything as one file to the printer?
Thank you ALL!
---EDIT---
Thank you both for the answers!
I managed to make it work:
gswin32c -dBATCH -dNOPAUSE -q -dPDFFitPage -dUseTrimBox -dFirstPage=1 -dLastPage=10 \
-dMediaPosition=7 -sDEVICE=pxlcolor \
-sOutputFile="%printer%Kyocera TASKalfa 7550ci KX" -f in.pdf
gswin32c -dBATCH -dNOPAUSE -q -dPDFFitPage -dUseTrimBox -dFirstPage=11 -dLastPage=15 \
-dMediaPosition=5 -sDEVICE=pxlcolor -dNumCopies=2 \
-sOutputFile="%printer%Kyocera TASKalfa 7550ci KX" -f in.pdf
gswin32c -dBATCH -dNOPAUSE -q -dPDFFitPage -dUseTrimBox -dFirstPage=16 -dLastPage=16 \
-dMediaPosition=7 -sDEVICE=pxlcolor \
-sOutputFile="%printer%Kyocera TASKalfa 7550ci KX" -f in.pdf
The only thing is that the page doesn't scale correctly on pxlcolor (it does on ljet4, but it's black and white).
I'm almost there! Thanks ^^. If anyone knows about this problem, I would appreciate it.
You have asked a lot of questions, all at once, that's not really a good way to get helpful answers. In addition you haven't really been too clear about some of the problems.
1) If you want to use the TrimBox for the media size then you have to tell Ghostscript you want to use the TrimBox, you do that by -dUseTrimBox, no matter what device you want to use.
2) The mswinpr2 device works by creating a Windows DeviceContext for the printer, rendering the input to a (RGB) bitmap, then blitting the bitmap to the DeviceContext and telling it to print itself. This is slow because it will involve rendering a large bitmap (size dependent on printer resolution) to memory and then sending that large bitmap to the device.
Its one great advantage is that it will work no matter what printer you have.
GSPrint uses a 'similar' but somewhat different technique and is claimed to be faster.
Note that both these devices use the default settings of the printer which probably won't work for your complex needs.
Colour management is, of course, up to Windows in this case, but if your original PDF is specified in say CMYK then this will involve conversions CMYK->RGB->CMYK which is bound to cause colour differences.
3) There are colour PCL devices available in Ghostscript, eg the cdeskjet device.
4) pdfwrite will use the TrimBox if you select -dUseTrimBox. Since it creates a PDF file its rather hard to see how it could 'select correct tray'. If you are sending the PDF file to the printer, then you could simply have started with the original PDF file. PDF files cannot contain device-dependent criteria, such as tray selection.
5) ps2write in its current incarnation will allow you to add device-specific operations, see ghostpdl/doc/VectorDevices.htm (also available on the ghostscript.com website), section 6.5 "PostScript file output" and look for the PSDocOptions and PSPageOptions keys. You could use the PSPageOptions array to introduce individual media selection commands to each page. I have no idea what you mean by 'messes up the page position', however yet again if you do not select -dUseTrimBox then it will not be using the TrimBox........
Oh, and if you want to 'scale the TrimBox to fit the whole page' (which you only mention regarding pdfwrite) then you will have to set up a fixed media of the size you want the page scaled to (-dFIXEDMEDIA, -dDEVICEHEIGHTPOINTS= and -dDEVICEWIDTHPOINTS=), select -dUseTrimBox and -dPDFFitPage.
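A rough sketch of the PSPageOptions mechanism mentioned above; the setdistillerparams syntax and the setpapertray snippet are assumptions to check against section 6.5 of ghostpdl/doc/VectorDevices.htm and your printer's PPD, and the tray number is hypothetical:

```shell
# ps2write can prepend per-page PostScript via the PSPageOptions array
# (entries are applied to pages in turn).
gs -o with-trays.ps -sDEVICE=ps2write \
   -c '<< /PSPageOptions [(statusdict begin 3 setpapertray end)] >> setdistillerparams' \
   -f input.pdf
```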
There is no easy way to do this. Since PDF itself does not provide a facility to switch paper trays, you need to convert the stream to another PDL. PostScript is a good choice.
While converting to PostScript you can inject PostScript tray switching commands like those found in PPD:
<< /ManualFeed false >> setpagedevice statusdict begin 5 setpapertray end
On Windows platform you have choices on the implementation:
Alter the PPD to make it inject the PostScript code before every page. The code should maintain a page counter and execute tray switching commands accordingly.
Buy a third-party software providing this capability.
Extend the printer driver with a DLL that injects the PostScript code.
The first may not work with your printer driver. In that case you can try to inject PostScript code at the beginning of the job. The code should override showpage, extending it with the capability described in the first option.
The same showpage-overloading code could be injected into the PostScript interpreter's startup sequence, if you had access to the internals of the controller.

How to resize/resample bitmap images within pdfs and maintain vector overlays

My workflow is usually as follows:
I annotate bitmap images using inkscape with vector elements (text, lines, etc.). I export a pdf from inkscape and include it in my pdflatex source (includegraphics...).
Somehow, I don't want to downscale the bitmaps prior to embedding in Inkscape, to keep the svg files universal. On the other hand, the resulting pdfs after pdflatex are often unnecessarily huge due to the images being at full (ridiculous) resolution. Unfortunately, the pdf export of Inkscape does not support downsampling of bitmaps right away (this is, however, often discussed). So I tried to use Ghostscript to reduce the pdfs from Inkscape before running pdflatex. However, all my vector annotations are rendered in this process, which is what I want to avoid and what this question is about.
I used ghostscript like this (found this in different flavors but nothing worked):
gs -dNOPAUSE -dBATCH -dSAFER \
-sDEVICE=pdfwrite \
-dCompatibilityLevel=1.3 \
-dPDFSETTINGS=/ebook \
-dEmbedAllFonts=true \
-dSubsetFonts=true \
-dColorImageDownsampleType=/Bicubic \
-dColorImageResolution=200 \
-dGrayImageDownsampleType=/Bicubic \
-dGrayImageResolution=400 \
-dMonoImageDownsampleType=/Bicubic \
-dMonoImageResolution=1200 \
-sOutputFile=out.pdf \
in.pdf
So I am looking for some help/ideas to get my image pdfs smaller yet still with the vector art in vector format.
1) Don't use -dPDFSETTINGS unless you are very sure what it will do. In your case, really don't use it.
2) 200, 400 and 1200 (!) dpi are still very high resolutions; you could try lower still.
3) You haven't set -dDownsampleColorImages, -dDownsampleGrayImages or -dDownsampleMonoImages. So the changes to the downsampling type and resolution won't actually do anything. (I know this is ridiculous, blame Adobe for the settings which we have to mimic....)
If you can supply an example file I can test this, but my suspicion is that '3' is your problem. You might also want to look at the ColorImageDownsampleThreshold (and Gray/Mono) switch.

PDF Optimization Acrobat vs. Ghostscript

I have a PDF file that I would like to optimize. I am receiving the file from an outside source so I don't have the means to recreate it from the beginning.
When I open the file in Acrobat and query the resources, it says that the fonts in the file take up 90%+ of the space. If I save the file as postscript and then save the postscript file to an optimized PDF, the file is significantly smaller (upwards of 80% smaller) and the fonts are still embedded.
I am trying to recreate these results with Ghostscript. I have tried various permutations of options with pswrite and pdfwrite, but what happens is that when I do the initial conversion from PDF to PostScript, the text gets converted to an image. When I convert back to PDF the font references are gone, so I end up with a PDF file that has 'imaged' text rather than actual fonts.
The file contains 22 embedded custom Type1 fonts which I have. I have added the fonts to the ghostscript search path and proved that ghostscript can find them with:
gs \
-I/home/nauc01
-sFONTPATH=/home/nauc01/fonts/Type1 \
-o 3783QP.pdf \
-sDEVICE=pdfwrite \
-g5950x8420 \
-c "200 700 moveto" \
-c "/3783QP findfont 60 scalefont setfont" \
-c "(TESTING !!!!!!) show showpage"
The resulting file has the font correctly embedded.
I have also tried using ghostscript to go from PDF to PDF like this:
gs \
-sDEVICE=pdfwrite \
-dNOPAUSE \
-I/home/nauc01 \
-dBATCH \
-dCompatibilityLevel=1.4 \
-dPDFSETTINGS=/printer \
-dCompressFonts=true \
-dSubsetFonts=true \
-sOutputFile=output.pdf \
input.pdf
but the output is usually larger than the input, and I can't view the file in anything but Ghostscript (Adobe Reader gives "Object label badly formatted").
I can't provide the original file because they contain confidential information but I will try to answer any questions that need to be answered regarding them.
Any ideas? Thanks in advance.
Don't use pswrite. As you've discovered, this will render text. Instead use the ps2write device, which retains fonts and text.
You don't say which version of Ghostscript you are using but I would recommend you use a recent one.
One point: Ghostscript isn't 'optimising' the PDF the way Acrobat does, it's re-creating it. The original PDF is fully interpreted to produce a sequence of operations that mark the page; pdfwrite (and ps2write) then make a new file which only has those operations inside.
If you choose to subset fonts, then only the required glyphs will be included. If the original PDF contains extraneous information (Adobe Illustrator, for example, usually embeds a complete copy of the .ai file) then this will be discarded. This may result in a smaller file, or it may not.
Note that pdfwrite does not support compressed xref and some other later features at present, so some files may well get bigger.
I would personally not go via ps2write, since this just adds another layer of processing and discarding of information. I would just use pdfwrite to create a new PDF file. If you find files for which this does not work (using current code) then you should raise a bug report at http://bugs.ghostscript.com so that someone can address the problem.
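A minimal sketch of that direct pdfwrite pass, reusing the font search path from the question; the subsetting/compression values are defaults to adjust, and there is no guarantee every file comes out smaller:

```shell
gs -o rewritten.pdf -sDEVICE=pdfwrite \
   -I/home/nauc01 -sFONTPATH=/home/nauc01/fonts/Type1 \
   -dSubsetFonts=true -dCompressFonts=true \
   input.pdf
```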
You might want to try the Multivalent Compress tool. It has an (experimental) option to subset embedded fonts that might make your PDF much smaller. It also contains a lot of switches that allow for better compression, sometimes at the cost of quality (JPEG compression of bitmaps, for example).
Unfortunately, the most recent version of Multivalent no longer includes the tools. Google for Multivalent20060102.jar; that version still includes them. To run Compress:
java -classpath /path/to/Multivalent20060102.jar tool.pdf.Compress [options] <pdf file>

Using ImageMagick or Ghostscript (or something) to scale PDF to fit page?

I need to shrink some large PDFs to print on an 8.5x11 inch (standard letter) page. Can ImageMagick/Ghostscript handle this sort of thing, or am I having so much trouble because I'm using the wrong tool for the job?
Just relying on the 'shrink to page' option in client-side print dialogs is not an option, as we'd like for this to be easy-to-use for the end users.
I would not use convert. It uses Ghostscript in the background, but is much slower. I'd use Ghostscript directly, since it gives me much more direct control (including some settings which are much more difficult to achieve with convert). And for convert to work for PDF-to-PDF conversion you'll need Ghostscript installed anyway:
gs \
-o /path/to/resized.pdf \
-sDEVICE=pdfwrite \
-dPDFFitPage \
-r300x300 \
-g2550x3300 \
/path/to/original.pdf
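The -g geometry is just inches times the -r resolution; for US letter at 300 dpi:

```shell
# device pixels = inches * dpi  (8.5x11in at 300 dpi)
awk 'BEGIN { printf "-g%.0fx%.0f\n", 8.5 * 300, 11 * 300 }'
```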
The problem with using ImageMagick is that you are converting to a raster image format, increasing file size and decreasing quality for any vector elements on your pages.
Multivalent will retain the vector information of the PDF.
Try:
java -cp Multivalent.jar tool.pdf.Impose -dim 1x1 -paper "8.5x11in" myFile.pdf
to create an output file myFile-up.pdf
ImageMagick's mogrify/convert commands will indeed do the job. Stephen Page had just about the right idea, but you do need to set the dpi of the file as well, or you won't get the job done.
Assuming you have a file that's 300 dpi and already the same aspect ratio as 8.5 x 11 the command would be:
# 300 dpi x 8.5 = 2550, 300 dpi x 11 = 3300
convert original.pdf -density "300" -resize "2550x3300" resized.pdf
If the aspect ratio is different, then you need to do some slightly trickier cropping.
The Ghostscript approach worked well for me. (I moved my file from my Windows PC to a Linux computer and ran it there.) I made one small change to the Ghostscript command, because the resize command above completely fills an 8.5 by 11 inch page. My printer cannot print to the edge, though, so several millimeters along each page edge were lost. To overcome that problem, I scaled my PDF document to 0.92 of a full 8.5 by 11 inches. That way I saw everything centered on the page and had a slight margin. Because 0.92 * (2550x3300) = (2346x3036), I ran the following Ghostscript command:
gs -sDEVICE=pdfwrite \
-dPDFFitPage \
-r300x300 \
-g2346x3036 \
/home/user/path/original.pdf \
-o /home/user/path/resized.pdf
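The 0.92 margin factor generalizes; to recompute the -g values for any scale (here 0.92 of 2550x3300):

```shell
# scaled device pixels, rounded to whole pixels
awk -v s=0.92 'BEGIN { printf "-g%.0fx%.0f\n", s * 2550, s * 3300 }'
```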
If you use Insert > Image... in LibreOffice Writer to insert a PDF, you can use direct manipulation or its Image Properties to resize and reposition the PDF, and when you File > Export as... PDF the PDF remains vectors and text. Interestingly when I did this with a PDF invoice the PDF exported from LO is smaller than the original, but the Linux pdfimages command-line utility suggests LO preserves any raster images within the original PDF.
However, you want something easier to use for your end users than the print dialog's "Shrink to page" option. There are tools like Adobe Acrobat that lay out PDFs to form print jobs that are PDFs; I don't know which ones have a simple "change the bounding box and scale to letter size". Surprisingly, the do-it-all qpdf tool lacks this feature.