Convert a region of each PDF page to grayscale - pdf

I have a PDF that I want to print, and every page has a thick rainbow stripe along the left border. To save color ink I would like to convert only this region to grayscale, or cover it completely with a white rectangle. I have looked into ImageMagick but could not find a solution that keeps all the other colors on the pages.
I have also thought of exporting each page to a separate PDF, applying a rectangle filter to each one and then combining them again. But I would prefer a simpler approach, as the quality of the graphs seems to decrease each time I convert a PDF.

You do not have to extract each page to do that in ImageMagick. You can process it all in one command. Here is an example.
Create PDF:
convert lena.jpg mandril3.jpg zelda1.jpg test.pdf
Create white image:
convert -size 100x100 xc:white white.png
Apply white image to every page of PDF:
convert test.pdf null: white.png -geometry +50+50 -layers composite result.pdf
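If the quality loss comes from rasterizing at 72 dpi (ImageMagick's default for PDF input), one option is to read the PDF with a higher -density. The white patch then has to be scaled by density/72. A minimal sketch of that arithmetic, where scale_region is a hypothetical helper name and the 100x100 patch at +50+50 is taken from the example above:

```shell
# scale_region DENSITY WIDTH HEIGHT X Y
# Convert a region measured at 72 dpi to pixels at DENSITY dpi
# and print it as WxH+X+Y (integer division, so values round down).
scale_region() {
  d=$1
  echo "$(( $2 * d / 72 ))x$(( $3 * d / 72 ))+$(( $4 * d / 72 ))+$(( $5 * d / 72 ))"
}

scale_region 288 100 100 50 50   # prints 400x400+200+200
```

The first part would go into -size for the white image and the offsets into -geometry, with -density 288 placed before test.pdf. The pages in the result are still raster images either way.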

Related

How to cut an overlong PDF page in a series of A4 PDF pages with ImageMagick

I have an overlong PDF containing text and image elements. Here is an example:
How can I split this page into a PDF with multiple A4 pages, where the first page contains the top bit, the next page what's shown underneath, and so on until the last page, which holds the leftover at the bottom?
When I just 'print' the PDF to a new PDF (e.g. in Preview on macOS) in full-width mode, only the middle of the input page is saved.
Can I do this easily with ImageMagick?
ps: If generating all pages is too difficult, having just the first page containing the top of the input PDF would be OK too.
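If your PDF width at 72 dpi is 595, then in ImageMagick 6 you can do
convert image.pdf -crop 595x842 +repage +adjoin result.pdf
If using ImageMagick 7, you can do
magick image.pdf -crop "%[papersize:a4]" +repage +adjoin result.pdf
Note that +adjoin writes each cropped tile to its own file (result-0.pdf, result-1.pdf, ...); leave it out to get a single multi-page result.pdf.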
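-crop without an offset cuts the page into tiles from the top down, so the number of output pages is the page height divided by 842, rounded up, and the last tile is shorter unless the height is an exact multiple. A small sketch of that count, assuming the height is known in points at 72 dpi (pages_needed is a hypothetical helper name, and 2000 is a made-up height):

```shell
# pages_needed HEIGHT [PAGE_HEIGHT]
# Number of tiles a HEIGHT-tall page is cut into; PAGE_HEIGHT defaults
# to 842, the A4 height at 72 dpi used in the commands above.
pages_needed() {
  height=$1
  page=${2:-842}
  echo $(( (height + page - 1) / page ))   # integer ceiling division
}

pages_needed 2000   # prints 3
```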

converting a pdf page to an image using GraphicsMagick

How do I convert only page 2 of a pdf file to a jpg image file, using GraphicsMagick command line prompt?
What option can I use in the gm.exe convert command?
gm.exe convert testing.pdf testing.jpg
Add the page number (starting from zero) in square brackets after the PDF filename:
gm.exe convert testing.pdf[1] testing.jpg
By the way, you can use the same indexing technique for accessing specific frames of a GIF animation, or layers of multi-layer/directory TIFFs.
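Since the index is zero-based while page numbers as people read them start at one, a tiny wrapper can avoid off-by-one mistakes. This is only a sketch; page_spec is a hypothetical helper name, not part of GraphicsMagick:

```shell
# page_spec FILE PAGE
# Print FILE[PAGE-1], i.e. the zero-based index for a one-based page number.
page_spec() {
  printf '%s[%d]\n' "$1" "$(( $2 - 1 ))"
}

page_spec testing.pdf 2   # prints testing.pdf[1]
```

In a Unix shell it would be used as gm convert "$(page_spec testing.pdf 2)" testing.jpg. The quotes matter there, because unquoted square brackets are glob characters.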
Use the command below to get a high-quality PNG with a white background (note that it uses ImageMagick rather than GraphicsMagick):
magick convert -density 300 -quality 100% -background white -alpha remove -alpha off ./646.04.pdf ./x.png

convert PDF to EPS to PNG without text

How do I convert a PDF to PNG and filter out the text?
I want to render images and vector graphics (vector text included) without plain text.
With the commands below only the image is extracted, not the whole page of the PDF:
gs -sDEVICE=eps2write -dFILTERTEXT -dFirstPage=1 -dLastPage=1 -o out.eps 091.pdf
convert -density 300 -background white -alpha off out.eps -resize 2480x3508! OUT.png
The FILTERTEXT switch (as I'm pretty sure is documented) only works on text, not on images which contain a pattern of pixels that look like text, or on vector linework which looks like text.
Without seeing your EPS I can't tell if the text you are complaining about is text or not, but my guess would be not.
By the way, if you want a PNG then there's no reason to convert the PDF to EPS first; just render the PDF directly to a PNG file.
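A minimal sketch of rendering the PDF page straight to PNG, with the text filter applied in the same Ghostscript run. The png16m device and -r300 are my choices to match the 300 dpi used in the question. The function name and the GS override variable are hypothetical conveniences, not Ghostscript features:

```shell
# render_without_text INPUT.pdf OUTPUT.png
# Render page 1 at 300 dpi as 24-bit PNG, dropping text objects.
# GS can be set to use a different Ghostscript binary (assumed name: gs).
render_without_text() {
  "${GS:-gs}" -sDEVICE=png16m -r300 -dFILTERTEXT \
    -dFirstPage=1 -dLastPage=1 -o "$2" "$1"
}

# Example call (needs Ghostscript installed):
# render_without_text 091.pdf out.png
```

As noted above, anything that only looks like text (outlined glyphs, scanned pages) would still be drawn.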

I need detect the approximate location of QR code in scanned image (PDF converted to PNG)

I have many scanned documents in PDF.
I use ImageMagick with Ghostscript to convert PDF to PNG at a high density: convert -density 288 2.pdf 2.png. After that I read the pixels with PHP, find where the QR code is and decode it. Because the image is very big (~2500px), this needs a lot of RAM. Before I read the pixels with PHP, I want to crop the image with ImageMagick and keep only the part with the QR code.
Can I detect the approximate location of the QR code with ImageMagick, then crop and keep only that part?
Sample PDF
Converted PNG
Further Update
I see your discussion with Kurt about better extraction of the image from the PDF in the first place, and his recommendation was to use pdfimages. I just wanted to add that you won't find that if you do brew search pdfimages, but you actually need to use
brew install poppler
and then you get the pdfimages executable.
Updated Answer
If you change the tile size to 100x100 on the crop command and run this for the second PDF you supplied:
convert -density 288 pdf2.pdf -crop 100x100 tile%04d.png
and then use the same entropy analysis command
convert -format "%[entropy]:%X%Y:%f\n" tile*.png info: | sort -n
...
...
0.84432:+600+3100:tile0750.png
0.846019:+600+2800:tile0678.png
0.980938:+700+400:tile0103.png
0.984906:+700+500:tile0127.png
0.988808:+600+400:tile0102.png
0.998365:+600+500:tile0126.png
The last 4 listed tiles (the highest-entropy ones) sit at offsets +600+400 to +700+500, i.e. they form one 200x200 block.
Likewise for the other PDF file you supplied, you get
0.863498:+1900+500:tile0139.png
0.954581:+2000+500:tile0140.png
0.974077:+1900+600:tile0163.png
0.97671:+2000+600:tile0164.png
which means the tiles at offsets +1900+500 to +2000+600, again one 200x200 block.
I would think that should help you pretty much approximately locate the QR code.
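To go from that sorted list to an actual crop, the offsets of the top few tiles can be merged into one bounding box. A rough sketch, assuming the "entropy:+X+Y:file" line format shown above and that every selected tile is a full tile (not a narrower one at the image edge). bbox_of_top_tiles is a hypothetical helper name:

```shell
# bbox_of_top_tiles TILE_SIZE COUNT
# Read entropy-sorted "entropy:+X+Y:file" lines on stdin, keep the last
# COUNT lines (highest entropy) and print one geometry WxH+X+Y covering them.
bbox_of_top_tiles() {
  tail -n "$2" | awk -F: -v t="$1" '
    {
      split($2, o, "+")                 # "+600+500" -> o[2]=600, o[3]=500
      x = o[2] + 0; y = o[3] + 0
      if (NR == 1 || x < minx)     minx = x
      if (NR == 1 || y < miny)     miny = y
      if (NR == 1 || x + t > maxx) maxx = x + t
      if (NR == 1 || y + t > maxy) maxy = y + t
    }
    END { if (NR) printf "%dx%d+%d+%d\n", maxx - minx, maxy - miny, minx, miny }'
}

# Possible use with the commands above (page.png is a placeholder name):
# geom=$(convert -format "%[entropy]:%X%Y:%f\n" tile*.png info: | sort -n | bbox_of_top_tiles 100 4)
# convert page.png -crop "$geom" +repage qr.png
```

If the top tiles are not adjacent, the box will be larger than the QR code, so it is worth sanity-checking the size before cropping.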
Original Answer
This is not all that scientific, but it may help you get started. The key, I think, is the entropy of the various areas of the image. The QR code has a lot of information encoded in a small area so it should have high entropy. So, I use ImageMagick to split the image into square 400x400 tiles like this:
convert image.png -crop 400x400 tile%03d.png
which gives me 54 tiles. Then I calculate the entropy of each of the tiles and sort them by increasing entropy, also outputting their offsets from the top left of the frame, and their name, like this:
convert -format "%[entropy]:%X%Y:%f\n" tile*.png info: | sort -n
0.00408949:+1200+2800:tile045.png
0.00473755:+1600+2800:tile046.png
0.00944815:+800+2800:tile044.png
0.0142171:+1200+3200:tile051.png
0.0143607:+1600+3200:tile052.png
0.0341039:+400+2800:tile043.png
0.0349564:+800+3200:tile050.png
0.0359226:+800+0:tile002.png
0.0549334:+800+400:tile008.png
0.0556793:+400+3200:tile049.png
0.0589632:+400+0:tile001.png
0.0649078:+1200+0:tile003.png
0.10811:+1200+400:tile009.png
0.116287:+2000+3200:tile053.png
0.120092:+800+800:tile014.png
0.12454:+0+2800:tile042.png
0.125963:+1600+0:tile004.png
0.128795:+800+1200:tile020.png
0.133506:+0+400:tile006.png
0.139894:+1600+400:tile010.png
0.143205:+2000+2800:tile047.png
0.144552:+400+2400:tile037.png
0.153143:+0+0:tile000.png
0.154167:+400+400:tile007.png
0.173786:+0+2400:tile036.png
0.17545:+400+1600:tile025.png
0.193964:+2000+400:tile011.png
0.209993:+0+3200:tile048.png
0.211954:+1200+800:tile015.png
0.215337:+400+2000:tile031.png
0.218159:+800+1600:tile026.png
0.230095:+2000+1200:tile023.png
0.237791:+2000+0:tile005.png
0.239336:+2000+1600:tile029.png
0.24275:+800+2400:tile038.png
0.244751:+0+2000:tile030.png
0.254958:+800+2000:tile032.png
0.271722:+2000+2000:tile035.png
0.275329:+0+1600:tile024.png
0.278992:+2000+800:tile017.png
0.282241:+400+1200:tile019.png
0.285228:+1200+1200:tile021.png
0.290524:+400+800:tile013.png
0.320734:+0+800:tile012.png
0.330168:+1600+2000:tile034.png
0.360795:+1200+2000:tile033.png
0.391519:+0+1200:tile018.png
0.421396:+1200+1600:tile027.png
0.421421:+2000+2400:tile041.png
0.421696:+1600+2400:tile040.png
0.486866:+1600+1600:tile028.png
0.489479:+1600+800:tile016.png
0.611449:+1600+1200:tile022.png
0.674079:+1200+2400:tile039.png
and, hey presto, the last one listed (i.e. the one with the highest entropy), tile039.png at offset +1200+2400, is the tile containing the QR code.
I have drawn a rectangle around its location using this command:
convert image.png -stroke red -fill none -strokewidth 3 -draw "rectangle 1200,2400 1600,2800" a.jpg
I concede there may be luck involved, but I only have one image to test my mad theories. You may need to tile twice, the second time with an x-offset and y-offset of half a tile width, so that you don't cut the QR code and split it across 2 tiles. You may need different size tiles for different size barcodes. You may need to consider the last 3-5 tiles located for your next algorithm. But I think it could form the basis of a method.

Centering a .jpg image in PDF using Imagemagick?

I'm using convert version ImageMagick 6.6.2-6 2011-03-16, and I'd like to use it to generate an A4 pdf from an image, where the image will be non-scaled and centered.
I'm running the following (as a modification of Overlaying Images with ImageMagick):
# generate a 100x100 JPG with just red color
convert -size 100x100 xc:red red.jpg
# generate PDF from JPG
convert -page A4 xc:white red.jpg -gravity center -composite -format pdf out.pdf
... but, basically nothing shows? Same thing happens for a png image...
Note that
Just 'convert -page A4 red.jpg out.pdf' works - but the image is not centered; (-gravity center causes image not to show)
If the image is png, 'convert -page A4 -gravity center red.png out.pdf' does indeed work fine
... however, I'd like convert to embed the contents of the JPEG stream directly - hence, I wouldn't like to convert the JPG to PNG first.
So, would it be possible to use convert to center a JPG image in an A4 PDF page directly?
Many thanks in advance for any answers,
Cheers!
EDIT2: @John Keyes's answer works for the example above, where the image is "smaller" than the PDF size. However, if the image is bigger, e.g.:
$ convert -size 1228x1706 -background \#f44 -rotate 45 gradient:\#f00-\#fff red.jpg
$ identify red.jpg
red.jpg JPEG 2075x2075 2075x2075+0+0 8-bit DirectClass 120KB 0.000u 0:00.000
... then it will fail. However, it turns out: "if you change -extent to 50x50, then play with -gravity, you'll see changes" - except, the question is: which extent do you change, that of the image - or that of the final PDF?
Well, it turns out - it is the extent of the final PDF... To find that size as convert sees it, check the page: Magick::Geometry - however, note that the "Postscript page size specifications" like "A4+43+43>" unfortunately, cause convert to crash in this context... But at least the respective numbers for the size (595x842) can be copied from the page; and finally this works:
convert -page A4 -gravity center -resize 595x842 -extent 595x842 red.jpg out.pdf
... and actually, the -extent part is not really needed; the -resize part is what makes the large image show.
The problem here is that the included image seems to be resampled. I'd just like it scaled to fit the page, with the original JPG stream otherwise inserted into the file unchanged. So I guess the question is still partially open :)
EDIT: Related:
ImageMagick Gravity parameter - Stack Overflow
ImageMagick and Geometry Issue - resizing with > - Stack Overflow
command line - Resizing and croping images to an aspect ratio of 6x4 with width of 1024 pixels - Unix and Linux - Stack Exchange
conversion - using imagemagick or ghostscript (or something) to scale PDF to fit page? - Stack Overflow
The following works perfectly for me:
convert -page A4 red.jpg -gravity center -format pdf out.pdf
and if you change the order of the "files" it works too:
convert -page A4 red.jpg xc:white -gravity center -composite -format pdf out.pdf
I think the red.jpg is centered but the white is drawn on top of it.
Well, this is outside of ImageMagick, but here is a solution in LaTeX using the tikz package (based on How to define a figure size so that it consumes the rest of a page? #14514), which seems to reliably place images on the page and preserve them fully:
% note: need to run pdflatex twice!! First time generates blank pages!
% convert -size 1228x1706 -background \#f44 -rotate 45 gradient:\#f00-\#fff red.jpg
% convert -size 595x1400 xc:red redlong.jpg
\documentclass[a4paper]{letter}
\usepackage{graphicx}
\usepackage[hmargin=0.5cm,vmargin=0.5cm]{geometry} % sets page margins
\usepackage{tikz}
\usetikzlibrary{calc}
\newcommand{\imagepage}[1]{
\tikz[overlay,remember picture]\coordinate (image-start); \par
\vfill
\null\hfill
\begin{tikzpicture}[overlay,remember picture]
\path let \p0 = (0,0), \p1 = (image-start) in
node [inner sep=0pt,outer sep=0pt,anchor=center] at (current page.center) {%
\pgfmathsetmacro\imgheight{\y1-\y0}%
\includegraphics[height=\imgheight pt,width=\textwidth,keepaspectratio]{#1}%
};
\end{tikzpicture}%
}
\begin{document}
% there must be a blank line (\n\n) after \begin{letter}!
\begin{letter}

\imagepage{red.jpg}
\end{letter}
\begin{letter}

\imagepage{redlong.jpg}
\end{letter}
% see also \resizebox{\textwidth}{!}{\includegraphics{red.jpg}}
\end{document}
Note: the letter documentclass is used so that each image is placed on a separate page.