How to convert a "landscape" PDF with "pdftocairo -eps" to a correct EPS?

I can't figure out how to convert a "landscape" PDF file with a single picture on it (e.g. paperwidth=842 and paperheight=595 points, with an image that fills the whole page) to an EPS file with the tool pdftocairo.
The output I get is either a shrunken version of the original file (the width is scaled from 842 to 595 to fit the "incorrect" page width of 595 in the EPS file), or an EPS where the content between 595 and 842 is simply cut off (with the -noshrink parameter).
Any ideas?
Edit1: pdftocairo version 0.43.0
Input PDF:
pdfinfo test.pdf
Creator: DocType PDF-Emitter (DocType PDF-Emitter v1.9.37-9-g1b2b6f3)
Producer: Haru Free PDF Library 2.3.0RC2
CreationDate: Mon Jun 4 11:18:30 2018
Tagged: no
UserProperties: no
Suspects: no
Form: none
JavaScript: no
Pages: 1
Encrypted: no
Page size: 595.44 x 841.68 pts
Page rot: 90
File size: 149246 bytes
Optimized: no
PDF version: 1.3
[Image: the landscape PDF]
Converted with pdftocairo -eps test.pdf:
[Image: the resulting EPS]

The most recent version of Poppler (on my system, it's v0.42.0) has the following command line options which may be of help:
$ pdftocairo -h
pdftocairo version 0.42.0
Copyright 2005-2016 The Poppler Developers - http://poppler.freedesktop.org
Copyright 1996-2011 Glyph & Cog, LLC
Usage: pdftocairo [options] <PDF-file> [<output-file>]
[....]
-eps : generate Encapsulated PostScript (EPS)
[....]
-r <fp> : resolution, in PPI (default is 150)
-rx <fp> : X resolution, in PPI (default is 150)
-ry <fp> : Y resolution, in PPI (default is 150)
-scale-to <int> : scales each page to fit within scale-to*scale-to pixel box
-scale-to-x <int> : scales each page horizontally to fit in scale-to-x pixels
-scale-to-y <int> : scales each page vertically to fit in scale-to-y pixels
-x <int> : x-coordinate of the crop area top left corner
-y <int> : y-coordinate of the crop area top left corner
-W <int> : width of crop area in pixels (default is 0)
-H <int> : height of crop area in pixels (default is 0)
-sz <int> : size of crop square in pixels (sets W and H)
-cropbox : use the crop box rather than media box
[....]
-level2 : generate Level 2 PostScript (PS, EPS)
-level3 : generate Level 3 PostScript (PS, EPS)
-origpagesizes : conserve original page sizes (PS, PDF, SVG)
-paper <string> : paper size (letter, legal, A4, A3, match)
-paperw <int> : paper width, in points
-paperh <int> : paper height, in points
-nocrop : don't crop pages to CropBox
-expand : expand pages smaller than the paper size
-noshrink : don't shrink pages larger than the paper size
-nocenter : don't center pages smaller than the paper size
[....]
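The pdfinfo output in the question reports a portrait media box (595.44 x 841.68 pts) together with Page rot: 90, which is presumably why the page displays as landscape while the EPS is laid out on a 595 pt wide page. A minimal Python sketch of that relationship follows; the helper name displayed_page_size is hypothetical, and it only computes the displayed dimensions from pdfinfo text. Whether a given pdftocairo version accepts page-size options such as -paperw and -paperh together with -eps is not verified here.

```python
import re


def displayed_page_size(pdfinfo_text):
    """Return (width, height) in points as a viewer would display the page."""
    size = re.search(r"Page size:\s*([\d.]+)\s*x\s*([\d.]+)\s*pts", pdfinfo_text)
    rot = re.search(r"Page rot:\s*(\d+)", pdfinfo_text)
    width, height = float(size.group(1)), float(size.group(2))
    rotation = int(rot.group(1)) if rot else 0
    # A rotation of 90 or 270 degrees swaps the displayed width and height.
    if rotation % 180 == 90:
        width, height = height, width
    return width, height


# The two relevant lines from the pdfinfo output in the question.
sample = """Page size: 595.44 x 841.68 pts
Page rot: 90"""

print(displayed_page_size(sample))
```

For the sample above, the displayed size comes out with the width and height swapped relative to the media box, i.e. a landscape page.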

Related

pdf2image converts page with page information that is not visible in the pdf viewer

I convert a PDF to images using pdf2image, which is a Python package.
But in the result, page content(?) that is not visible in a PDF viewer appears.
How can I remove this page content from the PDF itself, rather than from the image?
PDF file link is https://1drv.ms/b/s!Ar1AW_VI_HwvkMAOyDmQhFEKrZnRWg?e=fvWEwN
The visible area of a PDF page is constrained by a "crop box" or "trim box", which in most cases is identical to the paper "media box". However, when crop marks are used for printing or display, the crop box area will be smaller than the media box area.
pdf2image has a setting for crop boxes, use_cropbox=True (the default is False), so you would need to pass that argument in your invocation:
use_cropbox
Uses the PDF cropbox instead of the default mediabox. This is a rather dark feature that should be set to true when the module does not seem to work with your data.
However, looking into the file, the values differ from what I would expect. A source page is defined as
<< /CropBox [ 0 0 676 855] /MediaBox [ 0 0 676 856]...
so there would be no noticeable difference; the 1 unit is only 1/72".
But 48 pages later have additional (LaTeX?) crop box values of
<</CropBox[32.4 32.4 643.6 823.6]... and this seems to be what causes the trimmed viewport.
pdfinfo filename.pdf reports the cropped area: Page size: 611.2 x 791.2 pts (letter)
Because there are two conflicting settings, and I have no working pdf2image set-up for testing, I am not 100% confident that use_cropbox=True will always work reliably.
Other methods might work better: Ghostscript and other applications that Python can depend on have similar, or alternative, means to clip the image output directly from the file. Using Poppler directly, we would get the same default output.
However, if we specify -cropbox, the secondary crop box will, in this case, be taken into account.
pdftoppm -png -cropbox "process data sheet.pdf" output
If that did not work we would need to define the exact area using
-x <int> : x-coordinate of the crop area top left corner
-y <int> : y-coordinate of the crop area top left corner
-W <int> : width of crop area in pixels (default is 0)
-H <int> : height of crop area in pixels (default is 0)
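If the exact area has to be given by hand, the -x, -y, -W and -H values can be derived from the boxes quoted above. The following is a rough Python sketch, not a tested recipe: the helper name crop_area_pixels is hypothetical, it assumes the affected pages use the media box [0 0 676 856] together with the later crop box, and it assumes the crop origin is the top left corner of the rendered page, as the option descriptions state.

```python
def crop_area_pixels(media_box, crop_box, dpi=150):
    """Convert a PDF CropBox (points, origin bottom left) to x, y, W, H in pixels."""
    scale = dpi / 72.0  # PDF user space has 72 points per inch
    m_x0, m_y0, m_x1, m_y1 = media_box
    c_x0, c_y0, c_x1, c_y1 = crop_box
    x = round((c_x0 - m_x0) * scale)  # offset from the left edge
    y = round((m_y1 - c_y1) * scale)  # offset from the top edge
    w = round((c_x1 - c_x0) * scale)
    h = round((c_y1 - c_y0) * scale)
    return x, y, w, h


media = (0, 0, 676, 856)            # assumed media box of the affected pages
crop = (32.4, 32.4, 643.6, 823.6)   # later crop box quoted in the answer

dpi = 72
x, y, w, h = crop_area_pixels(media, crop, dpi)
# Argument list for pdftoppm; file names are the ones used in the answer.
args = ["pdftoppm", "-png", "-r", str(dpi),
        "-x", str(x), "-y", str(y), "-W", str(w), "-H", str(h),
        "process data sheet.pdf", "output"]
print(args)
```

At 72 dpi the width and height agree, after rounding, with the 611.2 x 791.2 pts page size that pdfinfo reports.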

georeferencing with gdal goes wrong

Although there are several pages dedicated to this task, I cannot make it work. I have a non-georeferenced tif file; here is its gdalinfo output:
Driver: GTiff/GeoTIFF
Files: Arctic_r05c02.2017235.terra.250m.tif
Size is 4096, 4096
Coordinate System is `'
Metadata:
TIFFTAG_SOFTWARE=ppm2geotiff v0.0.9
Image Structure Metadata:
INTERLEAVE=PIXEL
Corner Coordinates:
Upper Left ( 0.0, 0.0)
Lower Left ( 0.0, 4096.0)
Upper Right ( 4096.0, 0.0)
Lower Right ( 4096.0, 4096.0)
Center ( 2048.0, 2048.0)
Band 1 Block=4096x1 Type=Byte, ColorInterp=Red
Band 2 Block=4096x1 Type=Byte, ColorInterp=Green
Band 3 Block=4096x1 Type=Byte, ColorInterp=Blue
I do have the corner coordinate extent in lat/lon. Hence I am creating ground control points with the following command:
gdal_translate -of GTiff -gcp 0 0 -180.0000 +63.1066 -gcp 0 4096 -161.5651 +68.5979 -gcp 4096 0 +161.5651 +68.5979 -gcp 4096 4096 -180.0000 +76.3728 input.tif output.tif
which results in:
Driver: GTiff/GeoTIFF
Files: Arctic_r05c02.2017235.terra.250mproj.tif
Size is 4096, 4096
Coordinate System is `'
GCP Projection =
GCP[ 0]: Id=1, Info=
(0,0) -> (-180,63.1066,0)
GCP[ 1]: Id=2, Info=
(0,4096) -> (-161.5651,68.5979,0)
GCP[ 2]: Id=3, Info=
(4096,0) -> (161.5651,68.5979,0)
GCP[ 3]: Id=4, Info=
(4096,4096) -> (-180,76.3728,0)
Metadata:
TIFFTAG_SOFTWARE=ppm2geotiff v0.0.9
Image Structure Metadata:
INTERLEAVE=PIXEL
Corner Coordinates:
Upper Left ( 0.0, 0.0)
Lower Left ( 0.0, 4096.0)
Upper Right ( 4096.0, 0.0)
Lower Right ( 4096.0, 4096.0)
Center ( 2048.0, 2048.0)
Band 1 Block=4096x1 Type=Byte, ColorInterp=Red
Band 2 Block=4096x1 Type=Byte, ColorInterp=Green
Band 3 Block=4096x1 Type=Byte, ColorInterp=Blue
The next step is to resample with gdalwarp in order to create a projected file.
gdalwarp -r near -s_srs epsg:4326 -t_srs epsg:3413 -tr 250 250 input.tif output.tif
But the file seems distorted:
Any idea what else I can do? Is there a way to directly change the corner coordinates inside the tiff file instead of adding GCPs? Would that help?
Some more information which I forgot. The data can be downloaded at:
https://lance-modis.eosdis.nasa.gov/imagery/subsets/?subset=Arctic_r05c02.2017235.terra.250m
and for the bounding box definition:
https://lance-modis.eosdis.nasa.gov/imagery/subsets/?subset=Arctic_r05c02.2017235.terra.250m.met
Since the box definition from NASA's side looks a bit odd, I tried the same procedure for the following tile:
https://lance-modis.eosdis.nasa.gov/imagery/subsets/?subset=Arctic_r04c03.2017235.terra.250m
and box definition:
https://lance-modis.eosdis.nasa.gov/imagery/subsets/?subset=Arctic_r04c03.2017235.terra.250m.met
Although the box looks OK, the following image was produced:
The data you downloaded is already projected (in epsg:3413); you are assigning a projection (epsg:4326) and a bounding box that are not correct.
You can view the correct settings by visiting:
https://lance-modis.eosdis.nasa.gov/imagery/subsets/?subset=Arctic_r05c02.2017235.terra.250m.gdal
About halfway down the page you'll see the gdal_translate command; use that with your own input and output files.
For example:
gdal_translate -of GTiff -outsize 4096 4096 -projwin -2097152 2097152 -1048576 1048576 -a_srs "+proj=stere +lat_0=90 +lat_ts=70 +lon_0=-45 +k=1 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs" Arctic_r05c02.2017235.terra.250m.tif Arctic_r05c02.2017235.terra.250m_output.tif
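To see what the -projwin values in that command imply, the arithmetic can be sketched in Python. This is only a check of the numbers quoted above, not a call into GDAL; the helper name geotransform_from_projwin is hypothetical, and the six-element tuple follows the usual GDAL geotransform ordering (origin x, pixel width, row rotation, origin y, column rotation, pixel height).

```python
def geotransform_from_projwin(ulx, uly, lrx, lry, cols, rows):
    """Derive a north-up affine geotransform from a -projwin window and raster size."""
    xres = (lrx - ulx) / cols  # metres per pixel, west to east
    yres = (lry - uly) / rows  # negative because rows run north to south
    return (ulx, xres, 0.0, uly, 0.0, yres)


# -projwin ulx uly lrx lry and -outsize 4096 4096 from the command above.
gt = geotransform_from_projwin(-2097152, 2097152, -1048576, 1048576, 4096, 4096)
print(gt)
```

With these values the pixel size works out to 256 m in both directions, so the "250m" in the file name is nominal, and the upper left corner sits at (-2097152, 2097152) in the projected coordinate system.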

Tiling an image over a page with ImageMagick with print margins?

I am trying to get ImageMagick to do something for me and I am running into a few problems. First, I am not understanding units of measure and such passed into ImageMagick and so my script is not producing what I need. Second, the way I am doing it is extremely inefficient. Running this script takes a very long time (the one you see below is slightly trimmed down from what I am running).
So, to what I am doing... I have a number of svg files with icons in them. I am looking to generate a page for each of these files. The generated page will contain the icon tiled over the entire page, with a margin around it. I am looking for 1/2 inch tiles with 1/2 inch margins around the page, which needs to be US Letter (8 1/2 x 11 inch).
After reading a lot of the documentation this is what I came up with.
colors=(red blue purple yellow green black)
mkdir -p generated/icons/
for color in "${colors[@]}"; do
images=`printf "source/icons/${color}.svg%.0s " {1..300}`
montage $images -tile 15x20 -page Letter+1+1 -units PixelsPerInch -density 2550x3300 \
generated/icons/${color}.pdf
done
So for each of my files I run montage. I use printf to repeat the image file name 300 times. I then tile this 15x20 times. 15x20 comes from 8.5 minus 1 inch margins = 7.5*2 = 15 and likewise (11-1)*2 = 20. 300 images come from 15*20. I then say I want this on a letter page offset 1x1. (This was my attempt at a margin) I say I am speaking in pixel per inch (but none of the units seem to match up). I set the dpi to 300 by the density command where 8.5*300 = 2550 and 11*300 = 3300.
I've been toying with other settings (geometry etc.) but none of these are working. And the units don't seem to make sense either... Right now my resultant pdf is a square etc...
How do I make tiled pages as such? Also is there a way for me to do this more efficiently? What I have thus far is very slow.
EDIT:
Some more information:
i:montage --version
Version: ImageMagick 6.8.8-10 Q16 x86_64 2015-03-10 http://www.imagemagick.org
tile image:
my current output:
Notice that the margins are not right, the page is square rather than letter-sized, and the tiles are skewed.
Given the PNG image you provided, I presume you want a 1/2 inch border of white all around inside an 8.5x11 inch printed image, as the question describes. Thus the tiled width would be 7.5 inches and the tiled height would be 10 inches.
At 300 dpi, 1 in = 300 px, so border thickness = 0.5*300 = 150 px = 1 tile thick
11-1 = 10 inches tall for tiled region height = 10*300 = 3000 px
8.5-1 = 7.5 inches wide for tiled region width = 7.5*300 = 2250 px
1 tile = 0.5 inches at 300 dpi = 0.5*300 = 150 px
convert lUDbK.png -resize "150x150!" -write mpr:tile +delete -size 2250x3000 tile:mpr:tile -bordercolor white -border 150 -units pixelsperinch -density 300 tiled_page.png
Time to process was 1.75 sec on my Mac Mini.
This produces an image which is rather large. You will have to extract the image to see the border, since this page background is white.
(Note that PNG only supports pixelspercentimeter, but IM converts my specification of pixelsperinch accordingly. So if you look at the metadata, it will probably show some other density in units of pixelspercentimeter, but it will correspond to the desired 300 dpi.)
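The page arithmetic can be checked with a few lines of Python. This is a sketch under the question's stated requirements (US Letter, 1/2 inch tiles, 1/2 inch margins, 300 dpi); the function name page_layout and its dictionary keys are hypothetical.

```python
def page_layout(page_w_in, page_h_in, margin_in, tile_in, dpi):
    """Compute pixel sizes for a tiled page with a uniform margin on all sides."""
    tile_px = round(tile_in * dpi)
    border_px = round(margin_in * dpi)
    tiled_w = round((page_w_in - 2 * margin_in) * dpi)
    tiled_h = round((page_h_in - 2 * margin_in) * dpi)
    return {
        "tile_px": tile_px,
        "border_px": border_px,
        "tiled_size": (tiled_w, tiled_h),
        # Number of whole tiles across and down the tiled region.
        "grid": (tiled_w // tile_px, tiled_h // tile_px),
        # Final size once the border is added on every side.
        "page_size": (tiled_w + 2 * border_px, tiled_h + 2 * border_px),
    }


layout = page_layout(8.5, 11, 0.5, 0.5, 300)
for key, value in layout.items():
    print(key, value)
```

The useful check is that the tiled region plus twice the border equals the full page in pixels: a 2250x3000 tiled region only lands on a 2550x3300 letter page when the border is 150 px per side.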

PDFTK Rotating Pages Problem

I'm trying to use PDFTK to rotate pages in my PDF document. Executing something like the following should result in no changes to the page rotation:
pdftk in.pdf cat 1N output out.pdf
(This is rotating page 1 "north" or "0 degrees.")
In some PDF test documents, it works as expected (meaning, no changes to the page occur). However, on some test documents, the PDF document is rotated 90 degrees. An additional 90 degrees is consistently applied to any page rotation I attempt to do. So, if I do this:
pdftk in.pdf cat 1E output out.pdf
(This is rotating page 1 "east" or "90 degrees.") The result is the page is rotated 180 degrees -- an additional 90 degrees!
The PDF looks OK when viewed in Acrobat Reader.
The only difference with these problem test PDF documents is that I had already used Acrobat Pro to change their rotation. When applying PDFTK page rotation to these already rotated PDF documents, I run into this problem.
Any idea what's going on?
To rotate page 1 by 90 degrees clockwise:
pdftk in.pdf cat 1E output out.pdf # old pdftk
pdftk in.pdf cat 1east output out.pdf # new pdftk
To rotate all pages clockwise:
pdftk in.pdf cat 1-endE output out.pdf # old pdftk
pdftk in.pdf cat 1-endeast output out.pdf # new pdftk
Similarly, to rotate all pages anti-clockwise:
pdftk in.pdf cat 1-endwest output out.pdf
When you use the "normal" rotation parameters (N, E, S, W), you are setting the rotation flag on the PDF pages to your parameter (e.g. 90 degrees). This does not take into account the current rotation setting. Here is the paragraph from the pdftk documentation about rotation:
The page rotation setting can cause pdftk to rotate pages and
documents. Each option sets the page rotation as follows (in
degrees): N: 0, E: 90, S: 180, W: 270, L: -90, R: +90, D:
+180. L, R, and D make relative adjustments to a page's rotation.
In addition to the NESW rotation settings, you also have the L, R and D options, that allow you to make relative adjustments that take the current rotation flag into account.
If that does not solve your problem, I would need access to a couple of test documents (one that does work correctly, and one that results in the wrong rotation setting).
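The difference between the absolute and relative options can be modelled in a few lines of Python. This sketch only encodes the documentation paragraph quoted above; it is not pdftk's implementation, and the function name resulting_rotation is hypothetical.

```python
# Absolute options set the page rotation flag outright (degrees).
ABSOLUTE = {"N": 0, "E": 90, "S": 180, "W": 270}
# Relative options adjust whatever rotation flag the page already has.
RELATIVE = {"L": -90, "R": 90, "D": 180}


def resulting_rotation(current, option):
    """Return the page rotation flag after applying a pdftk rotation option."""
    if option in ABSOLUTE:
        return ABSOLUTE[option]
    return (current + RELATIVE[option]) % 360


# A page that already carries a 90 degree rotation flag:
print(resulting_rotation(90, "E"))  # absolute: flag is set to 90
print(resulting_rotation(90, "R"))  # relative: 90 is added to the flag
```

For a page whose flag is already 90, E leaves the flag at 90 while R moves it to 180, which is why the relative options are the ones to use when the existing rotation should be respected.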

Tile/concatenate high resolution PDF files with imagemagick

I've 9 high quality PDF files. I want to merge them into one large PDF of 3x3. I then want to turn this into a PNG file. I want to keep the resolution/sharpness during this process so that on the resulting PNG I can zoom right in and still see the fine detail. I thought I might do this with imagemagick but I'm struggling. Any ideas please?
I've tried this to merge them together to start with. It works, but the quality doesn't remain.
montage input_*.pdf -background none -tile 3x3 -geometry +0+0 output.pdf
Please note that file size and size of resulting image isn't an issue. I've no need to print it or anything like that. It's for viewing on a computer only.
Here is a sample of three of the PDF files:
1) https://www.dropbox.com/s/qc094jg1nkfk0jw/input_1.pdf?dl=0
2) https://www.dropbox.com/s/gb4u8r7bxg8lw2r/input_2.pdf?dl=0
3) https://www.dropbox.com/s/97dhi42wrvfxfd2/input_3.pdf?dl=0
Each PDF is 1071 x 1800 pts (using pdfinfo).
Thanks
James
Rather than sticking with PDF, merging, and then converting to PNG, you may do better to extract the images as PNG in the first place and then concatenate the PNG files like this:
# Use a different output root per file so the extracted images do not overwrite each other
pdfimages -png input_1.pdf a1
pdfimages -png input_2.pdf a2
pdfimages -png input_3.pdf a3
# Tile the first image of each file (skipping any mask) in a 3x3 grid
montage a?-000.png -background none -tile 3x3 -geometry +0+0 output.png
# Or append them side by side with "convert"
convert a?-000.png +append result.png
The second document seems to have a mask...
pdfimages -list input_1.pdf
page   num  type   width height color comp bpc  enc interp  object ID x-ppi y-ppi size ratio
--------------------------------------------------------------------------------------------
   1     0 image   12000 20167  icc     3   8  image  no         9  0   807   807 1260K 0.2%
pdfimages -list input_2.pdf
page   num  type   width height color comp bpc  enc interp  object ID x-ppi y-ppi size ratio
--------------------------------------------------------------------------------------------
   1     0 image   12000 20167  icc     3   8  image  no         9  0   807   807 5781K 0.8%
   1     1 smask   12000 20167  gray    1   8  image  no         9  0   807   807  230K 0.1%
pdfimages -list input_3.pdf
page   num  type   width height color comp bpc  enc interp  object ID x-ppi y-ppi size ratio
--------------------------------------------------------------------------------------------
   1     0 image   12001 20167  icc     3   8  image  no         9  0   807   807 2619K 0.4%
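The x-ppi and y-ppi columns follow from the pixel dimensions and the 1071 x 1800 pts page size given in the question, and the same numbers give the size of a full 3x3 canvas. A small Python sketch of that arithmetic follows; the helper names native_ppi and grid_canvas are hypothetical. Rendering the PDFs at roughly this density instead of extracting the images (for example with ImageMagick's -density option) would be one way to keep the detail, but that is an untested assumption here.

```python
def native_ppi(pixels, points):
    """Pixels per inch of an embedded image spanning `points` (1/72 in) on the page."""
    return round(pixels * 72 / points)


def grid_canvas(tile_w, tile_h, cols, rows):
    """Pixel size of a cols x rows grid of equally sized tiles with no spacing."""
    return cols * tile_w, rows * tile_h


print(native_ppi(12000, 1071), native_ppi(20167, 1800))
print(grid_canvas(12000, 20167, 3, 3))
```

Both axes round to the 807 ppi that pdfimages reports, and nine such tiles at full resolution form a canvas of 36000 x 60501 pixels, so memory use during the montage step is worth keeping in mind.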