Tiling an image over a page with ImageMagick with print margins? - pdf

I am trying to get ImageMagick to do something for me and I am running into a few problems. First, I am not understanding units of measure and such passed into ImageMagick and so my script is not producing what I need. Second, the way I am doing it is extremely inefficient. Running this script takes a very long time (the one you see below is slightly trimmed down from what I am running).
So to what I am doing... I have a number of svg files with icons in them. I am looking to generate a page for each of these files. The page generated will contain the icon tiled over the entire page with a margin on the side. I am looking for 1/2 inch tiles with 1/2 margins around the page which needs to be a US Letter (8 1/2 x 11 inch).
After reading a lot of the documentation this is what I came up with.
colors=(red blue purple yellow green black)
mkdir -p generated/icons/
for color in ${colors[#]}; do
images=`printf "source/icons/${color}.svg%.0s " {1..300}`
montage $images -tile 15x20 -page Letter+1+1 -units PixelsPerInch -density 2550x3300 \
generated/icons/${color}.pdf
done
So for each of my files I run montage. I use printf to repeat the image file name 300 times. I then tile this 15x20 times. 15x20 comes from 8.5 minus 1 inch margins = 7.5*2 = 15 and likewise (11-1)*2 = 20. 300 images come from 15*20. I then say I want this on a letter page offset 1x1. (This was my attempt at a margin) I say I am speaking in pixel per inch (but none of the units seem to match up). I set the dpi to 300 by the density command where 8.5*300 = 2550 and 11*300 = 3300.
I've been toying with other settings (geometry etc.) but none of these are working. And the units don't seem to make sense either... Right now my resultant pdf is a square etc...
How do I make tiled pages as such? Also is there a way for me to do this more efficiently? What I have thus far is very slow.
EDIT:
Some more information:
i:montage --version
Version: ImageMagick 6.8.8-10 Q16 x86_64 2015-03-10 http://www.imagemagick.org
tile image:
my current output:
Notice margins not right, is square not a letter page, also tiles as skewed

Given the PNG image you provided, and I presume you want a 1 inch border of white all around inside an 8.5x11 inch printed image. Thus the tiled width would be 7.5 inches and tiled height would be 10 inches.
1 in = 300 dpi so border thickness = 300 px = 2 tiles thick
11-1 = 10 inches tall for tiled region height = 10*300 = 3000 px
8.5-1 = 7.5 inches wide for tiled region width = 7.5*300 = 2250 px
1 tile = 0.5 inches at 300 dpi = 0.5*300 = 150 px
convert lUDbK.png -resize "150x150!" -write mpr:tile +delete -size 2250x3000 tile:mpr:tile -bordercolor white -border 300 -units pixelsperinch -density 300 tiled_page.png
Time to process was 1.75 sec on my Mac Mini.
This produces an image which is rather large. You will have to extract the image to see the border, since this page background is white.
(Note that PNG only supports pixelspercentimeter, but IM converts my specification of pixelperinch accordingly. So if you look at the meta data, it will probably show you some other density in units of pixelspercentimeter. But they will correspond to the desired 300 dpi.)

Related

ImageMagick converts PDF into tiny images despite setting density and resize options

I'm using ImageMagick to convert the following PDF to an PNG file: PDF from IMSLP (Permalink)
In a PDF viewer it looks nice (even though it needs quite a bit of zooming):
but when converting with
convert "file.pdf" "/tmp/file.png"
the produced image gets an extremely low resolution:
when adding density and resize information, I get somewhat bigger images, but still not the original resolution that is stored within the PDF (certainly not 300 DPI)
convert -density "300" -resize "3000x3000>" "file.pdf" "/tmp/file.png"
When using Poppler-Utils' pdfimages, I'm getting the appropriate image:
My question is: Is there any way to tell ImageMagick to extract the images in the "correct" resolution (as is stored in the PDF document)? In other words, ignore the zoom that is necessary to view the PDF properly, thus extracting the correct image resolution?
I'm using ImageMagick 7.1.0.16 with Ghostscript 9.55.0 inside an Alpine Linux docker image.
Very unusual structure you have there its been through many changes but we can guess some pages may have been converted to 300 dpi or 600 dpi since they all render at roughly the same size.
Note that graphics dpi is subjective it is not that value that's used inside a PDF it is the the pixels per default of 72 point units that relate to a graphics working dpi. the image may have been 75 dpi but stored at 300 pixels per 72 points.
1st Analysis says images are
image-0028 = 714 X 900 dots nominally 600 dpi
image-0002 = 726 X 900 dots nominally 600 dpi
image-0005 = 674 x 900 dots nominally 600 dpi
image-0008 = 674 x 900 dots nominally 600 dpi
image-0011 = 674 x 900 dots nominally 600 dpi
image-0014 = 674 x 900 dots nominally 600 dpi
but all have been down-sampled to various sizes approx. 1.2" x 1.5" so a sensible source size to match all those reductions is possibly
9.6" x 12" with some cropping.
Thus to get the nearest original quality extract pages # 600 dpi (lossless png would be best to keep those lossy jpeg flaws)
Then reconvert them to 75 dpi should give you the closest to the poor quality inputs.
You need to increase your density much larger and put your resize after reading the input in Imagemagick.
This will be 5800 × 7200 pixels:
convert -density 4800 IMSLP358086-PMLP578359-Ehr_OP_20_5.pdf[1] x.png
This will be 2417 × 3000 pixels:
convert -density 4800 IMSLP358086-PMLP578359-Ehr_OP_20_5.pdf[1] -resize "3000x3000>" y.png

How to avoid gray outline artefacts when converting an eps image to pdf?

To generate vector graphics figures with LaTeX labels, I use gnuplot and the cairolatex terminal, creating the image via plot "data.txt" u 1:2:3 matrix with image notitle followed by:
latex figuregen.tex
dvips -E -ofile.eps figuregen
# Correct the bounding box automatically:
epstool --copy --bbox file.eps filename.eps
## Create a pdf:
ps2pdf -dPDFSETTINGS=/prepress -dSubsetFonts=true -dEmbedAllFonts=true -dMaxSubsetPct=100 -dCompatibilityLevel=1.3 -dEPSCrop filename.eps filename.pdf
Here is a zoom on a specific region of the original eps image:
White regions actually correspond to NaN values in the data file.
Now using the pdf file converted from eps:
In the pdf version, there are now unwanted outlines around all the NaN pixels, creating an awful lot of noise in the higher portion of the image.
I want to have these images as pdf, free of artefacts, and preserve high-quality LaTeX labels. I suspect that there might be a ps2pdf option to deactivate this kind of unwanted behaviour but I just cannot find it.
I tried things such as: -dGraphicsAlphaBits=1, -dNOINTERPOLATE, -dALLOWPSTRANSPARENCY, -dNOTRANSPARENCY, -dCompatibilityLevel=1.4 or -dCompatibilityLevel=1.5, but without success.
I also tried fixing this directly in gnuplot, but without success (see e.g. below).
Would any of you know how to solve this issue?
Thank you very much for your time!
EDIT
What's even more surprising and problematic is that these artefacts also appear when printed.
Note however that they do not appear at extreme levels of zoom in evince when only a small part of the data set is plotted.
MWE:
# plot.plt
set size ratio -1
set palette defined ( 0 '#D73027', 1 '#F46D43', 2 '#FDAE61', 3 '#FEE090', 4 '#FFFFD9', 5 '#E0F3F8', 6 '#ABD9E9', 7 '#74ADD1', 8 '#4575B4' )
#set yr [300:0] ### no artefacts if zoom is higher than 1310% in evince
set yr [400:100] ### no artefacts if zoom is higher than 1780% in evince
#set yr [450:0] ### artefacts at all zoom levels if we show more data, or all of it
set term cairolatex dashed color; set output "temp.tex"
plot "data.txt" u 1:2:3 matrix with image notitle
set output #Closes the temporary output file.
!sed -e 's|/Title|%/Title|' -e 's|/Subject|%/Subject|' -e 's|/Creator|%/Creator|' -e 's|/Author|%/Author|' < temp.tex > graph.tex
and, for completeness:
% figuregen.tex
\documentclass[dvips]{article}
\pagestyle{empty}
\usepackage[dvips]{graphicx} %
\begin{document}
\input graph.tex
\end{document}
If needed, part of the data can be found in text form here; enough to reproduce the issue: https://paste.nomagic.uk/?e0343cc8f759040a#DkRxNiNrH6d3QMZ985CxhA21pG2HpVEShrfjg84uvAdt
EDIT 2
In fact, same artefact issues appear when using set terminal cairolatex pdf
set terminal cairolatex standalone pdf size 16cm,10.5cm dashed transparent
set output "plot.tex"
directly with pdflatex
gnuplot<plot.plt
pdflatex plot.tex
(Note, this is using Gnuplot Version 5.2 patchlevel 6).
The actual problem is, that NaN values are set to transparent black pixels (#00000000).
The transparency causes these gray outline artifacts, depending on the zooming level. If you zoom close enough, then you see no artifacts.
But as soon as the image pixels are smaller than your monitor pixels, the values are interpolated for screen display. Its seems that pdf viewers like evince (I tested also okular and mupdf) interpolate both color and alpha channels, so that the alpha value of the Nan pixels is changed, and the underlying black appears as gray border around the color pixels.
I tried several ways. The easiest one, which actually worked for me was to use the tikz terminal with option externalimages which saves images created with image as separate png file.
These png file also contains transparency, and the final result has the same artifacts.
But you can use imagemagick's convert to change the transparent NaN pixels of the png to white with
convert temp.001.png -alpha off -fill white -opaque black temp.001.png
So, a fully working plot file is
# plot.plt
set size ratio -1
set palette defined ( 0 '#D73027', 1 '#F46D43', 2 '#FDAE61', 3 '#FEE090', 4 '#FFFFD9', 5 '#E0F3F8', 6 '#ABD9E9', 7 '#74ADD1', 8 '#4575B4' )
set ytics 100
set yrange reverse
set term tikz standalone externalimages background "white"; set output "temp.tex"
plot "data.txt" u 1:2:3 matrix with image notitle
# temp.001.png is the external image which contains only the 'with image' part
# We must remove the #00000000 color, which represents the NaN pixels
# I couldn't replace the colors directly, but I could first remove the alpha channel
# and then change black to white, because no other black pixel appear
!convert temp.001.png -alpha off -fill white -opaque black temp.001.png
set output #Closes the temporary output file.
!sed -e 's|/Title|%/Title|' -e 's|/Subject|%/Subject|' -e 's|/Creator|%/Creator|' -e 's|/Author|%/Author|' < temp.tex > graph.tex
!pdflatex graph.tex
Mupdf screen shot for graph.pdf:
Note, that I used standalone to be able to directly compile the resulting file, so that I could check the result.
A more cumbersome alternative would be to "manually" plot with image to a png file, and include that in a second plot, like I described in Big data surface plots: Call gnuplot from tikz to generate bitmap and include automatically? Then you can have more influence on how the png is generated.
Just for the records, with image pixels seems to do the "trick" and will create a file without grey surrounding of NaN datapoints. Tested with gnuplot 5.2.6.
plot FILE u 1:2:3 matrix with image pixels notitle
Code:
### avoid shading around NaN datapoints
reset session
set size ratio -1
FILE = "data.txt"
set palette defined ( 0 '#D73027', 1 '#F46D43', 2 '#FDAE61', 3 '#FEE090', 4 '#FFFFD9', 5 '#E0F3F8', 6 '#ABD9E9', 7 '#74ADD1', 8 '#4575B4' )
set term cairolatex dashed color
set output "temp.tex"
plot FILE u 1:2:3 matrix with image pixels notitle
set output
### end of code
Result: (a PNG of a screenshot, since it looks like I cannot add a PDF here)

How to get DPI of a PDF file?

Using ImageMagick or GhostScript or any PHP code how can I get the DPI value of PDF files?
Here is the link for two demo files
http://jmp.sh/O5g5wL4 -- of 72 DPI
http://jmp.sh/RxrnYrY -- of 300 DPI
I have used
$image = new Imagick();
$image->readImage('xyz.pdf');
$resolutions = $image->getImageResolution();
It gives the same result for two different PDF files having different DPI.
I have also used
pdfimages -list xyz.pdf
It gives a list of all information but how to fetch the DPI value from the list.
How to get the exact DPI value of a PDF?
As fmw42 says PDF files themselves have no resolution. However in your case both the files consist of nothing but an image. In one case the image is ~48 MB and in the other its around 200 MB.
The reason is that the images have a different effective resolution.
In PDF the image is simply a bitmap, a sequence of coloured pixels. These are then drawn onto the underlying media. At this point there is no resolution, the pixels are laid down in a specific media size. In your case 22 inches by 82 inches.
The effective resolution is given by dividing the dimension by the number of pixels in the image in that dimension.
So if I have an image which is 1000x1000 pixels, and I draw it in a 1 inch square, then the effective resolution of the image is 1000 dpi. If I change my mind and draw it in a square 4 inches by 4 inches, then the effective resolution is 250 dpi.
The image hasn't changed, just the area it covers.
Now consider I have two images drawn in 1 inch squares. the first image is 1000x1000, the second is 500x500. The effective resolution of the first image is 1000 dpi, the effective resolution of the second is 500 dpi.
So you can see that, in PDF, the effective resolution of the image is a combination of the dimensions of the image, and the dimensions of the media it covers.
That's a difficult thing to measure in a PDF file. The area covered is calculated using matrix algebra and can be a combination of several different matrices.
The actual dimensions of the image, by contrast are quite easy to determine, they are given in the image dictionary. Your images are: 1620x5868 and 3372x12225. In both cases the media is the same size; 22.5x81.5 inches.
Since the images cover the entire media, the effective resolutions are;
1620/22.5 = 72 by 5868/81.5 = 72
3372/22.5 = 149.866 by 12225/81.5 = 150
I think MuPDF will give you image dimensions and media dimensions, assuming all your PDF files are constructed like this you can then simply perform the maths, but note that this won't be so simple for ordinary PDF files where images don't cover the entire media.
Using mutool info -I -M 150-dpi.pdf gives:
Retrieving info from pages 1-1...
Mediaboxes (1):
1 (6 0 R): [ 0 0 1620 5868 ]
Images (1):
1 (6 0 R): [ DCT ] 3375x12225 8bpc DevCMYK (12 0 R)
So there's your image dimensions and your media size. All you need to do is apply the division of one by the other.
Note: In debian and related distros, mutool is contained in mupdf-tools package, not in mupdf package itself. It can by therefore installed by sudo apt install mupdf-tools.
I use pdfimages -list from the poppler library, gives you all the information about the images.

GraphicsMagick crop PDF

I've got a 8.5x11 PDF at 300dpi. It has a single UPC label in the top left corner of the PDF. Imagine that there could be 30 labels on a 1 sheet, but we just have 1 label.
I'm trying to crop the PDF to be just the size of the 1 label. So far I've got this
gm convert -density 300 single.pdf out.pdf
Which doesn't do any cropping. When I crop to say 300x100 it makes a 20MB file with 30000 pages.
I have not a clue how to use -crop to actually crop to the correct size. I need it to be 3.5inches by 1.125 inches.
Using the following input PDF (here converted to a PNG):
the following command will crop the label:
gm wiz.pdf -crop 180x50+1+1 cropped.pdf
This label is sized 180x50 pixels.
For an 8.5x11in PDF at 300 PPI you'd have a 2450x3300 pixels PDF (which I doubt you do, but that's another question) and you'd need to use -crop 1050x337+0+0 (more exactly, 1050x337.5+0+0 -- but you cannot crop half pixels!).
Note, the +0+0 part crops the top left corner. If you need offset to the right by N pixels and to the bottom by M pixels use +N+M...
Using ImageMagick instead...
You could also use ImageMagick's convert command:
convert wiz.pdf[180x50+1+1] cropped.pdf
Comment about image sizes...
One additional comment about this remark:
"I have not a clue how to use -crop to actually crop to the correct size."
There is no other real size for raster images than pixels. ABC pixels wide and XYZ pixels high...
There is no such thing as an absolute, real size for a digital image that you can measure in inches... unless you additionally can state the resolution at which a given image is rendered on a display or a print device!
An 8.50x11in sized image at 300 PPI will translate to 2550x3300 pixels.
However, if your image does not contain this amount of pixels (which is the real, absolute size of any raster image), you may still be able to render it at 300 PPI -- but its size in inches will be different from 8.5x11in!
So, whenever you want to crop, use the absolute number of pixels you want. Don't use resolution/density at all on your command line!

Image Magick: Image optimization for websites

I have a camera which produces photographs of 3008x2000 pixels. I use Image Magick to scale and resize the photos to be put up on my website. The size of the images I am using on the website is 602x400. I use this command to reduce the size:
convert DSC_0124.JPG -scale 20% -size 24% img1.jpg
This produces an image which is 602x400 pixels in size. But the file size will be always above 250KB. More images on a single html page means the page will be heavier and loading time will be longer. Are there any features in image magic that will help me to keep the file size as small as possible, possibly, below 100KB. But the image size should be the same, that is, 602x400px. I have achieved similar optimisation with SEAMonster tool for MS Windows. As it doean't have a commandline alternative, it wouldn't be of much help when there are hundreds of images to be converted.
Use command as Delan proposed with additional "-strip" flag to remove EXIF data, this have reduced the size of some of my images drastically. Here is a bash script for unix platforms, but you can use the second part only for individual images.
for X in *.jpg; do convert "$X" -resize 602x400 -strip -quality 86 "$X"; done
This will convert all images in the directory.
Use -quality to set the compression level:
convert DSC_0124.JPG -scale 20% -size 24% -quality [0..100] img1.jpg
You can define the maximum size of the output image at 100KB like this:
convert DSC_0124.JPG -resize 602x400! -strip -define jpeg:extent=100KB img1.jpg
If you are running your website on PHP, you might want to consider the SLIR image resizing script, it does a great job resizing to various constraints (see below) and caches the results.
Parameters:
w Maximum width
h Maximum height
c Crop ratio
q Quality
b Background fill color
p Progressive
http://shiftingpixel.com/2008/03/03/smart-image-resizer/
http://code.google.com/p/smart-lencioni-image-resizer/