Convert PDF to TIFF with CMYK and Transparency

I’m attempting to convert from PDF to TIFF and maintain both the CMYK color space and the transparent areas in the PDF. There are many posts out there that say “just use PNG”, but for my application the CMYK color space is a must and PNG doesn’t support CMYK. I started working on this using ImageMagick (IM) with limited success, but have had better results so far using Ghostscript directly (which is what IM uses under the hood anyway). The problem that continually occurs is that the transparent areas from the PDF are filled in with white.
Context:
Ubuntu 16.04.3 LTS (AWS), Ghostscript 9.23, ImageMagick 7.0.7-36
I created a test PDF that has a transparent background and overlapping blocks of solid and semi-transparent colors:
trans-test.pdf
I’ve verified that the PDF has the correct transparent areas. If I use PNG (and thus the RGB colorspace) I get an image whose transparent areas match the PDF but, alas, it has the wrong colorspace:
output.png
(remember: web browsers will show it with a white background but it’s really transparent)
For conversions I’ve tried variations of the following commands (simplified for posting):
ImageMagick:
convert -depth 8 -colorspace CMYK trans-test.pdf output.tif
Ghostscript:
gs -sDEVICE=tiff32nc -sOutputFile=output_gs.tif -r144 trans-test.pdf
In IM, the delegate for processing CMYK PDFs uses the pamcmyk32 device. Some suggest changing that to pngalpha but that forces an unwanted change to the RGB color space. Some suggest converting first to PNG then back to CMYK but that results in color data loss.
Using Ghostscript directly, none of the TIFF devices have any options for transparency. After digging around for a while I found an old reply from a dev at Artifex (KenS) stating that “As for TIFF, there is no support in GS for making the unmarked areas transparent”:
Conversion...does not maintain transparency
It was disheartening but it was from 2011 so I’m holding out hope that there is some workaround for this issue by now. I’m searching for some configuration change to Ghostscript that will enable me to set all unmarked areas to transparent, or get it to start with a page erased to transparent rather than white.
The TIFF file format supports both CMYK color space and transparency so there must be a way to get both in the same file. Any insight into how to get both inside a TIFF would be welcome at this point. Thanks for reading.

Some creative solutions would be needed to get around the delegate limitations -- as pointed out in the comments.
I would suggest extracting the transparency to an intermediate mask, and reapplying it after enabling the alpha channel on the CMYK image (giving CMYKA data).
# Create transparent mask (we don't care about colorspace, just grab the alpha channel values)
convert -depth 8 -colorspace sRGB trans-test.pdf -alpha Extract mask.png
# Apply mask _after_ enabling alpha channel
convert \( \
-depth 8 \
-colorspace CMYK \
trans-test.pdf \
-alpha Activate \
\) \
mask.png -compose CopyAlpha -composite output.tif
The -alpha Activate turns on the alpha channel, but there's no data so everything is transparent. The following mask.png -compose CopyAlpha -composite populates the alpha channel with the values extracted by the previous operation.
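To check whether the mask step above actually produced a CMYK image with an alpha channel, one option is to ask ImageMagick for the colorspace and channel list of the result. This is only a minimal sketch: check_cmyka is a hypothetical helper name, and it assumes the identify tool is on the PATH (with ImageMagick 7 it may need to be called as magick identify).

```shell
# Hypothetical helper: print the colorspace and channel list of an image.
# %[colorspace] and %[channels] are ImageMagick format escapes.
check_cmyka() {
    identify -format '%[colorspace] %[channels]\n' "$1"
}

# Usage (assumes output.tif was written by the commands above):
#   check_cmyka output.tif
```

If both properties survived, the colorspace should be reported as CMYK and the channel list should include an alpha channel.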

I compared my solution to that of emcconville. His approach should be a good one, but my profile solution seems to match colors better in saved tif files displayed in Mac OSX Sierra Preview, GraphicConverter and Photoshop. Commands below are unix syntax in IM 7.0.7.37 Q16 HDRI and LIBTIFF Version 4.0.9 Ghostscript 9.23. You can see your versions from magick -list format.
Input:
trans-test.pdf
My solution:
magick -depth 8 -colorspace sRGB trans-test.pdf \
-profile /Users/fred/images/profiles/sRGB.icc \
-profile /Users/fred/images/profiles/USWebCoatedSWOP.icc \
trans-test_profile.tif
trans-test_profile.tif
emcconville's solution:
magick -depth 8 -colorspace sRGB trans-test.pdf -alpha Extract mask.png
magick \( -depth 8 -colorspace CMYK trans-test.pdf \) \
mask.png -alpha off -compose CopyOpacity -composite \
trans-test_mask.tif
trans-test_mask.tif
Note that -compose copyOpacity or copyAlpha generally requires -alpha off. Therefore, I do not see any reason to add -alpha activate.
Here are my delegates:
Version: ImageMagick 7.0.7-37 Q16 x86_64 2018-05-30 https://www.imagemagick.org
Copyright: © 1999-2018 ImageMagick Studio LLC
License: https://www.imagemagick.org/script/license.php
Features: Cipher DPC HDRI Modules OpenMP
Delegates (built-in): bzlib cairo djvu fftw fontconfig freetype gslib gvc jbig jng jp2 jpeg lcms lqr ltdl lzma openexr png ps raw rsvg tiff webp x xml zlib

Related

ImageMagick convert adds whitespace when converting PDF to PNG

I'm using ImageMagick to convert the following PDF to a PNG file.
Click here to download the PDF from IMSLP (Permalink if the direct download is broken)
In a PDF viewer it looks nice:
but when converting with convert -density 300 -background white -alpha off -alpha remove file.pdf /tmp/file.png
the image gets a large white margin:
I do not want to trim the image afterwards, I just want ImageMagick to somehow respect the view-port or however that viewing information is being encoded in the PDF. Does anyone know which command-line parameter might enable this behavior?
Edit 10.03.2022: I'm using ImageMagick 7.1.0.16 with Ghostscript 9.55.0 inside an Alpine Linux docker image.
I do not get your extra margin in ImageMagick 6.9.12-42 using Ghostscript 9.54. But changing the density does not seem to have any effect.
convert -density 300 -background white file.pdf[1] x2.png
The issue may be a malformed PDF. How was it created? Also, what version of Ghostscript are you using? It could be a GS version issue.
If this was a scanned PDF that is a raster image in a vector PDF shell, then you could just use pdfimages to extract the raster files. See https://manpages.debian.org/testing/poppler-utils/pdfimages.1.en.html
The hint from KenS was exactly what I was looking for - the PDF defines a CropBox that ImageMagick 7.1.0 was not using by default. The solution therefore is to modify the command to include the following -define information:
convert -define pdf:use-cropbox=true file.pdf /tmp/file.png
Thank you all for your help!
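To see whether a PDF defines a CropBox that differs from its MediaBox, the page boxes can be listed before converting. A minimal sketch, assuming pdfinfo from poppler-utils is installed; show_boxes is a hypothetical helper name:

```shell
# Hypothetical helper: list the MediaBox and CropBox of the first page.
# "pdfinfo -box" prints the page boxes (MediaBox, CropBox, BleedBox, ...).
show_boxes() {
    pdfinfo -box "$1" | grep -E 'MediaBox|CropBox'
}

# Usage:
#   show_boxes file.pdf
```

If the CropBox is smaller than the MediaBox, a converter that renders the MediaBox will show the extra area as a margin.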

ImageMagick convert pdf with multiple pages to high quality PNG

I am trying to convert a multi-page PDF to one long png with the following command:
convert -append -flatten -density 300 in.pdf out.png
I am using -flatten to lose transparency, since I want a white background in the final PNG. The problem is that it takes only the first page instead of using all the pages.
How can I convert the PDF to one long PNG while losing the transparency and using a white background?
This command works for me on IM 6.9.9.22 Q16 Mac OSX with Ghostscript 9.21
convert -density 300 -colorspace sRGB itc101_13.pdf -alpha off -append out.png
If it does not work for you, then what is your ImageMagick version and what is your Ghostscript version?
You have your syntax wrong. You must read the PDF before applying append. Try
convert -density 300 -colorspace sRGB in.pdf +adjoin -append -background white -flatten out.png
If that does not work, then what is your ImageMagick version and platform? What is your Ghostscript version and your libpng version? Can you post a link to your PDF file?
Note that +adjoin is not usually necessary for output to PNG, but won't hurt.
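As a variation on the commands above, each page can be flattened onto white with -alpha remove before the pages are appended. This is a sketch rather than a tested fix for the asker's setup; pdf_to_long_png is a hypothetical helper name and the 300 dpi density is taken from the question.

```shell
# Hypothetical helper: render every page, replace transparency with white,
# then stack the pages vertically into one PNG.
pdf_to_long_png() {
    # $1 = input PDF, $2 = output PNG
    # The PDF is read before -append, so all pages are in the image list.
    convert -density 300 "$1" -background white -alpha remove -alpha off -append "$2"
}

# Usage:
#   pdf_to_long_png in.pdf out.png
```

The key point is the same as in the answer: settings such as -density go before the input file, and operators such as -append go after it.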

Linux PDF to TIFF Quality Issue

I am trying to use a Linux application to convert .pdf files to .tiff for faxing; however, our clients have not been happy with the quality of Ghostscript's tiffg4 device.
In the image below, the left side shows a conversion using Ghostscript tiffg4 and the right is from an online conversion service. We are unable to see which application that service uses to attain that quality.
Note: The output TIFF must be black & white
Ghostscript Code:
gs -sDEVICE=tiffg4 -dNOPAUSE -dBATCH -dPDFFitPage -sPAPERSIZE=letter -g1728x2156 -sOutputFile=testg4.tiff test.pdf
We have tried these Ghostscript devices:
tiffcrle
tiffg3
tiffg32d
tiffg4
tifflzw
tiffpack
My question is this--does anyone know which application and/or setting is used to achieve the quality on the right?
Extending on BitBank's comment, you could write an RGB TIFF and then use ImageMagick to convert it to Group 4. ImageMagick allows you to control the dithering algorithm:
gs -sDEVICE=tiff24nc -dNOPAUSE -dBATCH -dPDFFitPage -sPAPERSIZE=letter -g1728x2156 -sOutputFile=intermediate.tiff your.pdf
convert intermediate.tiff -dither FloydSteinberg -compress group4 out.tiff
ImageMagick's manual has some background on the algorithm(s) and available options.
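To compare dithering against a plain threshold on the intermediate file, the two conversions can be wrapped in one small function. This is a sketch under assumptions: to_group4 is a hypothetical helper name, the 50% threshold is an arbitrary starting value, and which mode looks better depends on the page content.

```shell
# Hypothetical helper: convert an image to 1-bit Group 4 TIFF,
# either by hard threshold or by Floyd-Steinberg dithering.
to_group4() {
    # $1 = input image, $2 = output TIFF, $3 = "threshold" or anything else for dither
    case "$3" in
        threshold)
            convert "$1" -colorspace Gray -threshold 50% -compress Group4 "$2" ;;
        *)
            convert "$1" -colorspace Gray -dither FloydSteinberg -monochrome -compress Group4 "$2" ;;
    esac
}

# Usage:
#   to_group4 intermediate.tiff out_threshold.tiff threshold
#   to_group4 intermediate.tiff out_dither.tiff dither
```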

Convert multipage PDF to PNG with transparency

At the moment I am running into several problems converting a PDF file to PNG.
The transparency from the source PDF file is lost.
I have tested the following terminal tools to create the PNG:
Ghostscript, ImageMagick and the PDF tools from poppler-utils, always on a Debian system.
The image should have the same dimensions as the PDF, and also the same transparency.
used commands:
gs -dNOPAUSE -sDEVICE=pngalpha -sOutputFile=test%d.png -r96 -q design.pdf -c quit
convert design.pdf test%d.png
convert design.pdf -channel rgba -alpha on PNG32:test%d.png
convert -background none -colorspace srgb design.pdf -colorspace srgb -channel rgba -alpha on PNG32:test%d.png
pdftoppm -png file.pdf test
The result is not the expected PNG with transparency. The background is white but should be 100% transparent. Additionally, there is a green bar that should be semi-transparent. In all my tries the result ends up as a lighter green box with no transparency.
To show my result, I have uploaded the source PDF, the faulty PNG and the expected result (exported from Photoshop).
PDF: http://speedy.sh/W75HP/source-file.pdf
Result: http://speedy.sh/hfZMt/faulty-created-design.png
Expected: http://speedy.sh/7mpEk/design-the-way-it-should-be.png
I managed to get the white background to be transparent, but the actual file transparency, including the semi-transparent green bar/box, is not converted properly.
What's the solution for my issue?
Best regards,
Chris
//UPDATE
Okay, we have found a solution with another third-party tool which produces my expected result in an easy way.
inkscape design.pdf -z --export-dpi=100 --export-png=design.png
Thanks for the help
Using Imagick (PHP Extension) I converted a background color into transparent with some code like this (I converted a JPG with white background into transparent PNG):
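$mask = new Imagick('/your/file/path.jpg');
$mask->setImageFormat('png');
// Make white (within the fuzz tolerance) fully transparent; $fuzz = 1000 (3rd parameter) is just a guess.
// Note: paintTransparentImage() is deprecated in newer Imagick releases in favor of transparentPaintImage().
$mask->paintTransparentImage('white', 0, 1000);
$mask->writeImage('/your/file/path.png'); // save the result as PNG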
Take a look here:
http://de3.php.net/manual/en/imagick.painttransparentimage.php
Corresponding ImageMagick documentation:
http://www.imagemagick.org/script/command-line-options.php#transparent
Regards,
Michael
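The same idea can be sketched on the command line with the -transparent option that the linked ImageMagick documentation describes. white_to_transparent is a hypothetical helper name, and the 2% default fuzz is a guess that would need tuning per image.

```shell
# Hypothetical helper: make near-white pixels transparent and write the result.
white_to_transparent() {
    # $1 = input image, $2 = output PNG, $3 = optional fuzz (default 2%)
    fuzz="$3"
    [ -n "$fuzz" ] || fuzz='2%'
    convert "$1" -fuzz "$fuzz" -transparent white "$2"
}

# Usage:
#   white_to_transparent input.jpg output.png
#   white_to_transparent input.jpg output.png 5%
```

Note that this only knocks out a color; it cannot recover semi-transparency that was already flattened during rendering.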

Converting from pdf to png with ghostscript, result with many white boxes

I'm converting a PDF (created with Adobe Illustrator) into a transparent PNG file with the following command:
gs -q -sDEVICE=pngalpha -r300 -o target.png -f source.pdf
However, there are undesired white boxes in the resulting PNG. They look like they are auto-generated by Ghostscript, perhaps some kind of bounding box (see attached image).
I tried both gs-9.05 and gs-9.10, with the same bad result.
I've also tried exporting to a PNG file from Illustrator or Inkscape manually, and the result is good.
What does Inkscape do to render it correctly, and
how could I eliminate those white boxes using Ghostscript?
Try mudraw from the latest (1.3) MuPDF; as far as I checked, it creates nice PNGs from PDF files with 1.4 transparency:
mudraw -o out.png -c rgba in.pdf
"rgba" being, as you understand, RGB + alpha
In the general case, you can't. PDF does support transparency, but the underlying medium is always assumed to be white and opaque. So anywhere that marks are made on the medium is no longer transparent; it's white.
You don't say which version of Ghostscript you are using, but if it's earlier than 9.10 you could try upgrading.