Converting from PDF to PNG with Ghostscript, result with many white boxes

I'm converting a PDF (created with Adobe Illustrator) into a transparent PNG file with the following command:
gs -q -sDEVICE=pngalpha -r300 -o target.png -f source.pdf
However, there are undesired white boxes in the resulting PNG. They look like bounding boxes generated automatically by Ghostscript (see attached image).
I tried both gs-9.05 and gs-9.10, with the same bad result.
When I export to PNG manually from Illustrator or Inkscape, the result is good.
What does Inkscape do to render it correctly, and
how can I eliminate those white boxes using Ghostscript?

Try mudraw from the latest (1.3) MuPDF. As far as I checked, it creates nice PNGs from PDF files that use PDF 1.4 transparency:
mudraw -o out.png -c rgba in.pdf
"rgba" being, as you understand, RGB + alpha

In the general case, you can't. PDF does support transparency, but the underlying medium is always assumed to be white and opaque. So anywhere that marks are made on the medium is no longer transparent, it's white.
You don't say which version of Ghostscript you are using, but if it's earlier than 9.10 you could try upgrading.
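To illustrate what "white and opaque medium" means for a semi-transparent mark, here is a minimal sketch of the standard "over" blend in Python. The colors and the helper name `composite_over` are made up for illustration; this is not what Ghostscript does internally, only the arithmetic behind the effect.

```python
def composite_over(fg_rgb, alpha, bg_rgb=(255, 255, 255)):
    """Blend a foreground color over an opaque background ("over" operator).

    alpha is the foreground opacity in [0, 1]; the result is fully opaque.
    """
    return tuple(round(alpha * f + (1 - alpha) * b) for f, b in zip(fg_rgb, bg_rgb))


# Hypothetical example: a 50% transparent green mark on the white medium.
green = (0, 128, 0)
blended = composite_over(green, 0.5)
print(blended)
```

Once the mark has been blended with white, the pixel is an opaque, lighter color, and the original alpha can no longer be recovered from it.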

Related

Converting pdf to eps without rasterizing or changing fonts

I have been trying to convert a pdf vector graphic to eps. I tried two commands from the following answer: https://stackoverflow.com/a/44737018/5661667
The inkscape command inkscape input.pdf --export-eps=output.eps or rather, since --export-eps is deprecated now,
inkscape input.pdf --export-filename=output.eps
nicely converts to a vectorized eps. However, it strangely converts my Times New Roman fonts (the graphic was originally created using matplotlib) to some sans serif font (looks like Arial or something).
The ghostscript version of the conversion from the linked answer
gs -q -dNOCACHE -dNOPAUSE -dBATCH -dSAFER -sDEVICE=eps2write -sOutputFile=output.eps input.pdf
keeps my fonts nicely. However, the eps seems to be rasterized despite the -dNOCACHE option.
Is there any way to get one of these to just convert my pdf to eps without modifying it?
Further info: I am using Mac OS. For the first part, my suspicion is that I only have an Arial Unicode.ttf installed in /Library/Fonts/. I tried installing some other fonts, but had no success with my conversion.
I had the same problem when trying to convert a powerpoint generated pdf to eps format using inkscape.
After trying with gs and disabling the transparency I noticed some areas turned black after eps conversion.
gs -q -dNOCACHE -dNOPAUSE -dBATCH -dSAFER -dNOTRANSPARENCY -sDEVICE=eps2write -sOutputFile=output.eps input.pdf
Coming back to inkscape I noticed that Powerpoint added some transparent objects in these areas that turned black. So I manually removed them using inkscape and when converting to eps again the result was perfect!
In short: if there are transparent elements in your pdf, the fonts will probably be rasterized during eps conversion. So, you need to remove these elements.
Maybe there is an easier way to identify them in inkscape.
In my case I was able to use Find/Replace (Ctrl+F) to search objects with string "clipPath" and with 'Search option = Properties'. Then I open the Objects Tab (Menu Object->Objects...) and use that to delete each transparent object generated by Powerpoint.

ImageMagick convert adds whitespace when converting PDF to PNG

I'm using ImageMagick to convert the following PDF to a PNG file.
Click here to download the PDF from IMSLP (Permalink if the direct download is broken)
In a PDF viewer it looks nice:
but when converting with convert -density 300 -background white -alpha off -alpha remove file.pdf /tmp/file.png
the image gets a large white margin:
I do not want to trim the image afterwards, I just want ImageMagick to somehow respect the view-port or however that viewing information is being encoded in the PDF. Does anyone know which command-line parameter might enable this behavior?
Edit 10.03.2022: I'm using ImageMagick 7.1.0.16 with Ghostscript 9.55.0 inside an Alpine Linux docker image.
I do not get your extra margin in ImageMagick 6.9.12-42 using Ghostscript 9.54. But changing the density does not seem to have any effect.
convert -density 300 -background white file.pdf[1] x2.png
The issue may be a malformed PDF. How was it created? Also, what version of Ghostscript are you using? It could be a GS version issue.
If this was a scanned PDF that is a raster image in a vector PDF shell, then you could just use pdfimages to extract the raster files. See https://manpages.debian.org/testing/poppler-utils/pdfimages.1.en.html
The hint from KenS was exactly what I was looking for - the PDF defines a CropBox that ImageMagick 7.1.0 was not using by default. The solution therefore is to modify the command to include the following -define information:
convert -define pdf:use-cropbox=true file.pdf /tmp/file.png
Thank you all for your help!
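For anyone wondering why the wrong page box produces a margin: a rough sketch in Python of how a page box in PDF points (1/72 inch) maps to pixels at a given density. The box sizes below are hypothetical, not taken from the IMSLP file, and `pixels_at_density` is just an illustrative helper.

```python
def pixels_at_density(width_pt, height_pt, density):
    """Convert a page box in PDF points (1/72 inch) to pixel dimensions."""
    return round(width_pt / 72 * density), round(height_pt / 72 * density)


# Hypothetical boxes: a full-page MediaBox and a smaller CropBox.
media_box = (595, 842)
crop_box = (500, 700)

media_px = pixels_at_density(*media_box, 300)
crop_px = pixels_at_density(*crop_box, 300)
print("MediaBox render:", media_px)
print("CropBox render: ", crop_px)
```

If the renderer uses the larger box, the extra pixels around the visible area show up as the white margin.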

Ghostscript convert PDF to JPG (CMYK profile) resolution error

I'm using Ghostscript to convert some PDF files to JPG. The conversion works: the program honors the 600 dpi resolution and the output JPEG quality is good.
Here is my command:
gs -sDEVICE=jpegcmyk -dTextAlphaBits=4 -r600 -dSAFER -dBATCH -dNOPAUSE -o my_output_file.jpg my_input_file.pdf
But when I open the file in Photoshop, the properties show 72 dpi instead of the 600 dpi I expected:
When I try with an RGB profile for output, it is OK: I get 600 dpi.
So what I want is CMYK plus 600 dpi in the image properties.
As can be seen from your screenshots, both images are of the same dimensions, 6803 by 709 pixels.
And that is all that matters.
Also, the size of the CMYK version is bigger by about 33% compared to the RGB version -- as is to be expected for an image with 4 color channels instead of 3.
Ghostscript used the -r600 CLI parameter to correctly expand the number of pixels when converting the PDF file.
Ghostscript does not add any EXIF metadata to its output when converting a PDF to raster.
The DPI or PPI information would be an internal metadata hint to tell any compliant viewers how big to render the image on screen. It would not change anything substantial in the image information itself.
Why Photoshop thinks it should use 72 dpi for one but 600 dpi for the other is something you may have to ask Adobe.
I bet Photoshop also renders the 72 dpi file about 8 times larger on screen than the other (600 / 72 ≈ 8.3). Is that the case?
P.S.: See also "What DPI do web images need to be?"
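To make the point about DPI being only a hint concrete, here is a small Python sketch using the pixel dimensions from the question (6803 by 709). The helper name `print_size_inches` is made up for illustration.

```python
def print_size_inches(width_px, height_px, dpi):
    """Physical size implied by a pixel size and a DPI metadata value."""
    return width_px / dpi, height_px / dpi


# Same pixels, two different DPI hints.
w600, h600 = print_size_inches(6803, 709, 600)
w72, h72 = print_size_inches(6803, 709, 72)

print(f"at 600 dpi: {w600:.2f} x {h600:.2f} in")
print(f"at  72 dpi: {w72:.2f} x {h72:.2f} in")
print(f"size ratio: {w72 / w600:.2f}")
```

The pixel data is identical in both cases; only the implied physical size changes, by the ratio of the two DPI values.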

Convert multipage PDF to PNG with transparency

At the moment I am running into several problems converting a PDF file to PNG.
The transparency of the source PDF file is lost.
I have tested the following terminal tools to create the PNG:
Ghostscript, ImageMagick and the PDF tools from poppler-tools, always on a Debian system.
The image should have the same dimensions as the PDF, and the same transparency.
Commands used:
gs -dNOPAUSE -sDEVICE=pngalpha -sOutputFile=test%d.png -r96 -q design.pdf -c quit
convert design.pdf test%d.png
convert design.pdf -channel rgba -alpha on PNG32:test%d.png
convert -background none -colorspace srgb design.pdf -colorspace srgb -channel rgba -alpha on PNG32:test%d.png
pdftoppm -png file.pdf test
The result is not the expected PNG with transparency. The background is white but should be 100% transparent. Additionally, there is a green bar that should be semi-transparent. In all my attempts the result is a lighter green box with no transparency.
To show my result, I have uploaded the source PDF, the faulty PNG and the expected result (exported from Photoshop).
PDF: http://speedy.sh/W75HP/source-file.pdf
Result: http://speedy.sh/hfZMt/faulty-created-design.png
Expected: http://speedy.sh/7mpEk/design-the-way-it-should-be.png
I managed to get the white background to be transparent, but the actual file transparency, including the semi-transparent green bar/box, is not converted properly.
What's the solution for my issue?
Best regards,
Chris
//UPDATE
Okay, we have found a solution with another third-party tool, which produces my expected result in an easy way.
inkscape design.pdf -z --export-dpi=100 --export-png=design.png
Thanks for the help.
Using Imagick (PHP Extension) I converted a background color into transparent with some code like this (I converted a JPG with white background into transparent PNG):
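$mask = new Imagick('/your/file/path.jpg');
$mask->setImageFormat('png');
// Make pixels matching 'white' fully transparent (alpha 0); the fuzz of 1000 (3rd parameter) is just a guess
$mask->paintTransparentImage('white', 0, 1000);
// Save the result (the output path is a placeholder)
$mask->writeImage('/your/file/path.png');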
Take a look here:
http://de3.php.net/manual/en/imagick.painttransparentimage.php
Corresponding ImageMagick documentation:
http://www.imagemagick.org/script/command-line-options.php#transparent
Regards,
Michael
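The idea behind the fuzz parameter can be sketched in plain Python: every pixel whose color lies within some distance of the target color gets alpha 0. This is a simplified illustration using Euclidean RGB distance, not ImageMagick's exact metric, and `white_to_transparent`, the fuzz value and the sample pixels are all hypothetical.

```python
import math


def white_to_transparent(pixels, fuzz=30.0, target=(255, 255, 255)):
    """Return RGBA pixels where colors within `fuzz` of `target` get alpha 0."""
    result = []
    for r, g, b in pixels:
        distance = math.dist((r, g, b), target)  # Euclidean distance in RGB
        alpha = 0 if distance <= fuzz else 255
        result.append((r, g, b, alpha))
    return result


# Hypothetical sample: pure white, near-white, and a green pixel.
sample = [(255, 255, 255), (250, 250, 250), (0, 128, 0)]
converted = white_to_transparent(sample)
print(converted)
```

Note that this only knocks out a flat background color. It cannot restore partial transparency that was already blended into the pixels, such as the semi-transparent green bar.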

ImageMagick - PDF resize without stretching interior content

I have a PDF of size 8.27in × 10.87in, but the printer at my work requires pages be US Letter size (8.5in x 11in). ImageMagick can achieve the resizing with:
convert -page Letter mypdf.pdf mypdf-converted.pdf
The resulting pdf is 8.5x11 and prints out fine, but the text and images in the pdf are stretched. What I want is exact copies of original 8.27x10.87 pages stuck on 8.5x11 pages with no stretching of the interior content. Ideally the content would be centered, but having the content aligned to one side is fine as well.
I've messed with some of the -filter options, but to no avail. Is there an option for what I'm describing?
Also, as a note: I tried printing the file to a PDF in Document Viewer (the default Linux PDF viewer) and adjusting the page settings, but that caused what I think are weird character encoding issues. For some reason I can view the pdf on my computer fine, but the printer prints out lots of "!"s and "%"s and other random characters in place of the text. The resulting PDF from ImageMagick doesn't cause any problems like this.
Thanks!
You can use our free cpdf tool:
Coherent PDF Command Line Tools Community Release
First, change the size of the pages:
cpdf -mediabox "0 0 8.5in 11in" in.pdf -o out.pdf
Then, shift the contents rightward and upward by half the difference in size, that is (8.5 - 8.27) / 2 = 0.115in horizontally and (11 - 10.87) / 2 = 0.065in vertically:
cpdf -shift "0.115in 0.065in" out.pdf -o out2.pdf
Or, all together:
cpdf in.pdf -mediabox "0 0 8.5in 11in" AND -shift "0.115in 0.065in" -o out.pdf
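If your page sizes differ from the ones in the question, the centering offsets (half the difference between the two page sizes) can be computed with a few lines of Python. This is just a sketch; `centering_shift` is a made-up helper, and the sizes are the ones from the question.

```python
def centering_shift(old_size, new_size):
    """Offsets (in the same unit as the sizes) that center old content on the new page."""
    old_w, old_h = old_size
    new_w, new_h = new_size
    return (new_w - old_w) / 2, (new_h - old_h) / 2


# Sizes in inches: original page and US Letter.
dx, dy = centering_shift((8.27, 10.87), (8.5, 11.0))
shift_arg = f"{dx:.3f}in {dy:.3f}in"
print(shift_arg)
```

The printed string can then be passed as the argument to `-shift`.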