ImageMagick PDF to JPEG conversion results in green square where image should be

I'm attempting to convert a PDF to a JPEG using ImageMagick.
The PDF:
baby_aRCWTU.pdf
The command:
convert -density 260 -profile 'SWOP.icc' -profile 'sRGB.icm' 'baby_aRCWTU.pdf' 'baby_aRCWTU.jpg'
The resulting JPEG:
baby_aRCWTU.jpg
As you can see, the text is rendered nicely, but the embedded image shows up as a green square. Any ideas? This occurs with and without the colour profiles.
edit: reposted due to broken links

On a site where we convert hundreds of PDFs to JPGs every day, we found that the only reliable approach was to convert the PDFs to PostScript first.
We use the pdftops command. Try:
pdftops baby_aRCWTU.pdf baby_aRCWTU.ps
Then run your convert command above, but on the .ps file. That works for me: the image is then included.
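The two steps can be wrapped in a small shell function. This is only a sketch: pdf_to_jpg_via_ps and the DRY_RUN switch are hypothetical names introduced here, and the pdftops and convert invocations simply mirror the commands quoted above (it assumes both tools and the two profile files are available).

```shell
#!/bin/sh
# Sketch: convert a PDF to JPEG by way of PostScript.
# pdf_to_jpg_via_ps and DRY_RUN are hypothetical names for illustration.
pdf_to_jpg_via_ps() {
    pdf=$1
    base=${pdf%.pdf}              # strip the .pdf suffix
    # With DRY_RUN set, the commands are printed instead of executed.
    run=${DRY_RUN:+echo}
    $run pdftops "$pdf" "$base.ps" &&
    $run convert -density 260 -profile SWOP.icc -profile sRGB.icm "$base.ps" "$base.jpg"
}

# Print the commands that would be run for the PDF from the question.
DRY_RUN=1 pdf_to_jpg_via_ps baby_aRCWTU.pdf
```

Without DRY_RUN set, the function runs pdftops first and only calls convert if that step succeeds.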

Related

ImageMagick convert adds whitespace when converting PDF to PNG

I'm using ImageMagick to convert the following PDF to a PNG file.
Click here to download the PDF from IMSLP (Permalink if the direct download is broken)
In a PDF viewer it looks nice:
but when converting with convert -density 300 -background white -alpha off -alpha remove file.pdf /tmp/file.png
the image gets a large white margin:
I do not want to trim the image afterwards, I just want ImageMagick to somehow respect the view-port or however that viewing information is being encoded in the PDF. Does anyone know which command-line parameter might enable this behavior?
Edit 10.03.2022: I'm using ImageMagick 7.1.0.16 with Ghostscript 9.55.0 inside an Alpine Linux docker image.
I do not get your extra margin in ImageMagick 6.9.12-42 using Ghostscript 9.54. But changing the density does not seem to have any effect.
convert -density 300 -background white file.pdf[1] x2.png
The issue may be a malformed PDF. How was it created? Also, what version of Ghostscript are you using? It could be a Ghostscript version issue.
If this was a scanned PDF that is a raster image in a vector PDF shell, then you could just use pdfimages to extract the raster files. See https://manpages.debian.org/testing/poppler-utils/pdfimages.1.en.html
The hint from KenS was exactly what I was looking for - the PDF defines a CropBox that ImageMagick 7.1.0 was not using by default. The solution therefore is to modify the command to include the following -define information:
convert -define pdf:use-cropbox=true file.pdf /tmp/file.png
Thank you all for your help!

Convert LaTeX PDF from Vector to Raster

In my battle against cheating (I am a teacher), I would like to convert my LaTeX PDFs to images so that students cannot cut and paste out of the files. I am currently using ImageMagick to do this:
convert -density 300 mwe.pdf mwe_convert.pdf
While this works, when the PDF is not zoomed in far enough, aliasing causes unfortunate artifacts. My understanding is that antialiasing is on by default in convert (https://legacy.imagemagick.org/Usage/antialiasing/).
If I instead use the following command, things are better:
convert -density 800 -resample 300 mwe.pdf mwe_convert.pdf
With this PDF, as I zoom in and out it does a better job of preserving fine lines in the file. I'm guessing I'm getting some different type of effective antialiasing by doing this, but the options to convert have outsmarted me.
The problem is that using density and resample causes the paper size in mwe_convert.pdf to be 3.19 × 4.12 inch (according to identify). This means that viewing the file causes it to open small on the screen, and that you need crazy magnifications to make it readable on the screen.
So my question for the crowd is: is there (a) a way to do the density/resample and get the correct paper size at the end, or (b) a better way to achieve my goal?
I cannot include a PDF here as an MWE, but I can show what I'm seeing. Here's a screenshot of the original LaTeX PDF.
Here's a screen shot of the -density 300 without the resample:
Here's a screen shot of the -density 800 -resample 300. Note that the PDF was even smaller on the screen and the equals sign was still visible.
Here is a way to get back to letter-sized pages:
pdfjam --outfile converted_pdf_file.pdf --letterpaper letter_sized_pages.pdf
Said pdfjam command is found in the texlive-extra-utils package (for Debian).
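The 3.19 × 4.12 inch size reported in the question is what you get by scaling a US Letter page (8.5 × 11 in) by 300/800. The question does not state the original paper size, so Letter is an assumption here, and the explanation that the page size is still computed at the 800 dpi rendering density after the pixel count was reduced for 300 dpi is one possible reading rather than confirmed behaviour. A quick check of the arithmetic (scaled_page_size is a hypothetical helper name):

```shell
#!/bin/sh
# Sketch: page size after rendering at one density and resampling to another,
# if the page size is still derived from the rendering density.
# Arguments: width_in height_in render_density resample_density
scaled_page_size() {
    awk -v w="$1" -v h="$2" -v d="$3" -v r="$4" \
        'BEGIN { printf "%.4f x %.4f\n", w * r / d, h * r / d }'
}

# Assumed US Letter page, -density 800 -resample 300 as in the question.
scaled_page_size 8.5 11 800 300   # prints 3.1875 x 4.1250
```

Rounded to two decimals, this agrees with the size that identify reported.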

Ghostscript convert PDF to JPG (CMYK profile) resolution error

I'm using Ghostscript to convert some PDF files to JPG. The conversion itself works: the program honours the 600 dpi resolution and the output JPEG quality is good.
Here is my code :
gs -sDEVICE=jpegcmyk -dTextAlphaBits=4 -r600 -dSAFER -dBATCH -dNOPAUSE -o my_output_file.jpg my_input_file.pdf
But when I open the file in Photoshop, the properties show 72 dpi instead of the 600 dpi I expected:
When I use an RGB profile for the output, it is fine and I get 600 dpi.
So what I want is CMYK output with 600 dpi in the image properties.
As can be seen from your screenshots, both images are of the same dimensions, 6803 by 709 pixels.
And that is all that matters.
Also, the size of the CMYK version is bigger by about 33% compared to the RGB version -- as is to be expected for an image with 4 color channels instead of 3.
Ghostscript used the -r600 CLI parameter to correctly expand the number of pixels when converting the PDF file.
Ghostscript does not add any EXIF metadata to its output when converting a PDF to raster.
The DPI or PPI information would be an internal metadata hint to tell any compliant viewers how big to render the image on screen. It would not change anything substantial in the image information itself.
Why Photoshop thinks it should use 72 dpi for one but 600 dpi for the other is a question for Adobe.
I bet Photoshop also renders the 72 dpi file about 8 times larger on screen than the other (600 / 72 ≈ 8.3). Is that the case?
P.S.: See also "What DPI do web images need to be?"
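To make the point about the DPI hint concrete, here is a small sketch of the arithmetic, using the 6803-pixel dimension mentioned above. The pixel data is identical in both files; only the physical size implied by the DPI value differs. print_size_inches is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: physical size implied by a pixel count and a DPI value.
# Arguments: pixels dpi
print_size_inches() {
    awk -v px="$1" -v dpi="$2" 'BEGIN { printf "%.2f\n", px / dpi }'
}

print_size_inches 6803 600   # prints 11.34
print_size_inches 6803 72    # prints 94.49
```

The same 6803 pixels therefore describe roughly 11 inches at 600 dpi and roughly 94 inches at 72 dpi.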

imagemagick - generated PDFs are blurry/hazy

I have an image of a website which I'm trying to convert to PDF. I have the image in several formats: PSD, PNG, JPG, TIFF, all saved losslessly.
I'm using the following command to convert the image to PDF:
convert -density 93 foo.jpg bar.pdf
Here is part of the original image:
And here is the same part, after converting to PDF:
As you can see, the second one is ever so slightly hazy. What's causing this, and how can I eliminate it? I've seen PDFs with crisp graphics, so I know it's possible.
If you are seeing the same results with multiple input types, the fuzziness is likely being caused by the anti-aliasing feature of your PDF viewer. If you are using Acrobat, you can turn off image anti-aliasing by doing the following:
Go to Edit-->Preferences-->Page Display
Untick the option "Smooth Images" and hit "OK".
The crisp graphics you are seeing in other PDFs are likely vector graphics. ImageMagick creates a PDF and embeds your raster image inside it, and that image may be subject to compression.
Also:
When using JPEG as input, add "-quality 100" to your ImageMagick call to retain the highest quality possible.
Use a higher value for the "-density" parameter (I would recommend at least 150) to generate a higher resolution PDF.

ImageMagick convert produces a darker CMYK PDF than PhotoShop

The ImageMagick (IM) result of this command
convert myRGB.png -colorspace cmyk cmyk.pdf
is not as bright or as close to the screen colors as a Photoshop produced CMYK PDF. myRGB.png is a PNG file produced using GIMP.
I don't own Photoshop, and would like to stick with open source tools.
The current Ubuntu release of IM is 6.7.7. That IM version produces a very dark, totally unusable CMYK PDF.
I built 7.0.2-6 Q16 from source on Ubuntu 14.0.4, after also building LCMS package from source, and the above command works better, but the CMYK PDF as stated above is less bright and less close to the screen colors than the similar Photoshop output. E.g. blacks are not totally black; the sky color is dull blue instead of bright blue/cyan.
I've tried using ICC files downloaded from Adobe as in the following
convert myRGB.png -colorspace cmyk -profile WebCoatedSWOP2006Grade5.icc cmyk.pdf
I've tried this command with all 14 Adobe ICC files and there is no difference in any of them. Although, I admit I do not understand under what circumstances the ICC comes into play or if it is appropriate to this problem at all.
The simple question is why does IM convert tool not match the Photoshop results for CMYK?
The second question is, if IM can't be made to do it: is there any open source tool or set of tools that can match the Photoshop results for producing a CMYK PDF from an RGB PNG?
There are two applications involved, as you presumably know since you tagged this with Ghostscript. You haven't said which version of Ghostscript you have installed but the first thing I would do is remove ImageMagick from the equation.
Find out whether IM is having Ghostscript produce RGB or CMYK output; my bet is that it is getting RGB from GS. You'll need to find out what Ghostscript command line IM is using, and I can't tell you how to do that. If the Ghostscript output is RGB, that would explain why altering the IM settings makes no difference.
Proceeding on the assumption that the above is correct, use the png16m device in Ghostscript to produce RGB PNG files directly, this reduces the scope of the problem:
gs -sDEVICE=png16m -o out.png input.pdf
Now, you don't say what version of Ghostscript you have installed, but assuming it's relatively recent you can look in the /ghostpdl/doc directory and find considerable information on using colour management in Ghostscript; the document GS9_Color_Management.pdf may be helpful. It will certainly give you a myriad of opportunities to alter the output.