Adobe Acrobat cannot open PDF file generated by gnuplot

I plot my data with gnuplot and set the output to a PDF file.
When I open that file with Adobe Acrobat Reader, it shows this error:
Adobe Acrobat Reader could not open 'Compare_E.pdf' because it is either not a supported file type or because the file has been damaged (for example, it was sent as an email attachment and wasn't correctly decoded).
Here is the plot file I used:
#!/usr/bin/gnuplot
set encoding iso_8859_1
set terminal postscript eps enhanced color font "Helvetica" 14
set output "Da.pdf"
set xr[0:1000]
set yr[0:]
set xlabel "Time (ps)"
set ylabel "Da(\305^{2})"
set key left top
plot "800/data.dat" u ($1/1000):5 w line lt 1 lw 2 lc rgb "green" title "800"
These are the terminals available in my gnuplot:
gnuplot> show terminal
terminal type is x11
gnuplot> set term
Available terminal types:
canvas HTML Canvas object
cgm Computer Graphics Metafile
context ConTeXt with MetaFun (for PDF documents)
domterm DomTerm terminal emulator with embedded SVG
dumb ascii art for anything that prints text
dxf dxf-file for AutoCad (default size 120x80)
emf Enhanced Metafile format
epslatex LaTeX picture environment using graphicx package
fig FIG graphics language V3.2 for XFIG graphics editor
hpgl HP7475 and relatives [number of pens] [eject]
mf Metafont plotting standard
mp MetaPost plotting standard
pcl5 PCL5e/PCL5c printers using HP-GL/2
pict2e LaTeX2e picture environment
postscript PostScript graphics, including EPSF embedded files (*.eps)
pslatex LaTeX picture environment with PostScript \specials
pstex plain TeX with PostScript \specials
pstricks LaTeX picture environment with PSTricks macros
svg W3C Scalable Vector Graphics
tek40xx Tektronix 4010 and others; most TEK emulators
tek410x Tektronix 4106, 4107, 4109 and 420X terminals
texdraw LaTeX texdraw environment
tkcanvas Tk canvas widget
unknown Unknown terminal type - not a plotting device
vttek VT-like tek40xx terminal emulator
x11 X11 Window System interactive terminal
xlib X11 Window System (dump of gnuplot_x11 command stream)
xterm Xterm Tektronix 4014 Mode
gnuplot>
Has anyone run into the same error?
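One thing worth checking: the script above selects the postscript eps terminal but writes to a file named Da.pdf, so the file content is probably EPS whatever its extension says. The following is a minimal Python sketch for looking at the leading bytes of a file. The function names sniff_format and sniff_file are hypothetical, and only the well-known "%PDF-" and "%!PS" signatures are checked.

```python
def sniff_format(header: bytes) -> str:
    """Guess the file type from its first bytes (signature check only)."""
    if header.startswith(b"%PDF-"):
        return "pdf"
    if header.startswith(b"%!PS"):
        # EPS files additionally carry "EPSF" on the first line.
        first_line = header.split(b"\n", 1)[0]
        return "eps" if b"EPSF" in first_line else "postscript"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return sniff_format(handle.read(64))


if __name__ == "__main__":
    # Hypothetical first lines, not taken from the actual Da.pdf.
    print(sniff_format(b"%!PS-Adobe-2.0 EPSF-2.0\n"))
    print(sniff_format(b"%PDF-1.4\n"))
```

If sniff_file("Da.pdf") reports eps, the file would need to be renamed to .eps or converted to PDF with a separate tool, since the terminal list above does not include a PDF terminal.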

Related

pdf2image converts page with page information that is not visible in the pdf viewer

I convert a PDF to images using pdf2image, a Python package.
However, the resulting images show page information(?) that is not visible in a PDF viewer.
How can I remove this page information from the PDF itself, not from the image?
PDF file link is https://1drv.ms/b/s!Ar1AW_VI_HwvkMAOyDmQhFEKrZnRWg?e=fvWEwN
The page content shown by a viewer is constrained by a "crop box" or "trim box", which in most cases is identical to the paper "media box". However, when crop marks are used for printing or display, the crop box area is smaller than the media box area.
pdf2image has a setting for crop boxes, use_cropbox (default False), so in your invocation you would need to pass use_cropbox=True.
use_cropbox
Uses the PDF cropbox instead of the default mediabox. This is a rather dark feature that should be set to true when the module does not seem to work with your data.
However, looking into the file, the values differ from what one would expect. A source page is defined as
<< /CropBox [ 0 0 676 855] /MediaBox [ 0 0 676 856]...
so there would be no noticeable difference; the 1 unit is only 1/72".
But 48 pages have additional, later (LaTeX?) crop box values of
<</CropBox[32.4 32.4 643.6 823.6]... and this seems to affect the trimmed viewport.
pdfinfo filename.pdf reports the cropped area Page size: 611.2 x 791.2 pts (letter)
Because there are two conflicting crop settings, and without a working pdf2image set-up for testing, I am not 100% confident that use_cropbox=True will always work reliably.
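The page size reported by pdfinfo can be checked against the crop box values quoted above with a little arithmetic. This is a small Python sketch; box_size is a hypothetical helper, and the two boxes are copied from the dictionary entries shown earlier.

```python
def box_size(box):
    """Return (width, height) in points for a PDF box [llx, lly, urx, ury]."""
    llx, lly, urx, ury = box
    return (urx - llx, ury - lly)


MEDIA_BOX = [0, 0, 676, 856]            # from the source page dictionary
CROP_BOX = [32.4, 32.4, 643.6, 823.6]   # the later, tighter crop box

if __name__ == "__main__":
    width, height = box_size(CROP_BOX)
    print(f"{width:.1f} x {height:.1f} pts")  # 611.2 x 791.2 pts
```

The result matches the 611.2 x 791.2 pts that pdfinfo prints, which suggests pdfinfo is reading the second crop box rather than the first.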
There are other methods that might work better: Ghostscript and other applications that Python packages depend on have similar, or alternative, means to clip the image output directly from the file. Using poppler directly we would get the same default output.
However, if we specify -cropbox, the secondary crop will, in this case, be taken into account.
pdftoppm -png -cropbox "process data sheet.pdf" output
If that did not work we would need to define the exact area using
-x <int> : x-coordinate of the crop area top left corner
-y <int> : y-coordinate of the crop area top left corner
-W <int> : width of crop area in pixels (default is 0)
-H <int> : height of crop area in pixels (default is 0)
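As a rough sketch of how those four values could be derived, the Python below converts a crop box given in points into pixel arguments for pdftoppm. The helper name crop_args is hypothetical, and the sketch assumes that pdftoppm measures -x and -y from the top-left corner of the rendered media box at the resolution given by -r; that assumption has not been tested against this file.

```python
def crop_args(media_box, crop_box, dpi=150):
    """Build pdftoppm -r/-x/-y/-W/-H arguments from PDF boxes given in points."""
    scale = dpi / 72.0  # PDF units are 1/72 inch
    m_llx, m_lly, m_urx, m_ury = media_box
    c_llx, c_lly, c_urx, c_ury = crop_box
    x = round((c_llx - m_llx) * scale)
    # PDF y grows upward from the bottom; pixel y grows downward from the top.
    y = round((m_ury - c_ury) * scale)
    width = round((c_urx - c_llx) * scale)
    height = round((c_ury - c_lly) * scale)
    return ["-r", str(dpi), "-x", str(x), "-y", str(y),
            "-W", str(width), "-H", str(height)]


if __name__ == "__main__":
    args = crop_args([0, 0, 676, 856], [32.4, 32.4, 643.6, 823.6], dpi=144)
    command = ["pdftoppm", "-png", *args, "process data sheet.pdf", "output"]
    print(command)
```

The list form can be handed to subprocess.run unchanged, which avoids quoting problems with the space-containing file name.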

How to avoid gray outline artefacts when converting an eps image to pdf?

To generate vector graphics figures with LaTeX labels, I use gnuplot and the cairolatex terminal, creating the image via plot "data.txt" u 1:2:3 matrix with image notitle followed by:
latex figuregen.tex
dvips -E -ofile.eps figuregen
# Correct the bounding box automatically:
epstool --copy --bbox file.eps filename.eps
## Create a pdf:
ps2pdf -dPDFSETTINGS=/prepress -dSubsetFonts=true -dEmbedAllFonts=true -dMaxSubsetPct=100 -dCompatibilityLevel=1.3 -dEPSCrop filename.eps filename.pdf
Here is a zoom on a specific region of the original eps image:
White regions actually correspond to NaN values in the data file.
Now using the pdf file converted from eps:
In the pdf version, there are now unwanted outlines around all the NaN pixels, creating an awful lot of noise in the higher portion of the image.
I want to have these images as pdf, free of artefacts, and preserve high-quality LaTeX labels. I suspect that there might be a ps2pdf option to deactivate this kind of unwanted behaviour but I just cannot find it.
I tried things such as: -dGraphicsAlphaBits=1, -dNOINTERPOLATE, -dALLOWPSTRANSPARENCY, -dNOTRANSPARENCY, -dCompatibilityLevel=1.4 or -dCompatibilityLevel=1.5, but without success.
I also tried fixing this directly in gnuplot, but without success (see e.g. below).
Would any of you know how to solve this issue?
Thank you very much for your time!
EDIT
What's even more surprising and problematic is that these artefacts also appear when printed.
Note however that they do not appear at extreme levels of zoom in evince when only a small part of the data set is plotted.
MWE:
# plot.plt
set size ratio -1
set palette defined ( 0 '#D73027', 1 '#F46D43', 2 '#FDAE61', 3 '#FEE090', 4 '#FFFFD9', 5 '#E0F3F8', 6 '#ABD9E9', 7 '#74ADD1', 8 '#4575B4' )
#set yr [300:0] ### no artefacts if zoom is higher than 1310% in evince
set yr [400:100] ### no artefacts if zoom is higher than 1780% in evince
#set yr [450:0] ### artefacts at all zoom levels if we show more data, or all of it
set term cairolatex dashed color; set output "temp.tex"
plot "data.txt" u 1:2:3 matrix with image notitle
set output #Closes the temporary output file.
!sed -e 's|/Title|%/Title|' -e 's|/Subject|%/Subject|' -e 's|/Creator|%/Creator|' -e 's|/Author|%/Author|' < temp.tex > graph.tex
and, for completeness:
% figuregen.tex
\documentclass[dvips]{article}
\pagestyle{empty}
\usepackage[dvips]{graphicx} %
\begin{document}
\input graph.tex
\end{document}
If needed, part of the data can be found in text form here; enough to reproduce the issue: https://paste.nomagic.uk/?e0343cc8f759040a#DkRxNiNrH6d3QMZ985CxhA21pG2HpVEShrfjg84uvAdt
EDIT 2
In fact, the same artefacts appear when using set terminal cairolatex pdf:
set terminal cairolatex standalone pdf size 16cm,10.5cm dashed transparent
set output "plot.tex"
and compiling directly with pdflatex:
gnuplot<plot.plt
pdflatex plot.tex
(Note, this is using Gnuplot Version 5.2 patchlevel 6).
The actual problem is that NaN values are set to transparent black pixels (#00000000).
The transparency causes these gray outline artifacts, depending on the zoom level. If you zoom in close enough, you see no artifacts.
But as soon as the image pixels are smaller than your monitor pixels, the values are interpolated for screen display. It seems that pdf viewers like evince (I also tested okular and mupdf) interpolate both color and alpha channels, so that the alpha value of the NaN pixels is changed, and the underlying black appears as a gray border around the color pixels.
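The effect can be illustrated numerically. The Python sketch below averages one palette colour ('#FFFFD9') with a transparent black NaN pixel in two ways and composites the result onto a white page. The function names are hypothetical, and that the viewers really use the channel-by-channel variant is an assumption based on the observation above, not something checked in their source code.

```python
def mix_straight(p, q):
    """Average two RGBA pixels channel by channel (colour and alpha separately)."""
    return tuple((a + b) / 2 for a, b in zip(p, q))


def mix_premultiplied(p, q):
    """Average two RGBA pixels with each colour weighted by its own alpha."""
    alpha = (p[3] + q[3]) / 2
    if alpha == 0:
        return (0.0, 0.0, 0.0, 0.0)
    rgb = tuple((a * p[3] + b * q[3]) / 2 / alpha for a, b in zip(p[:3], q[:3]))
    return rgb + (alpha,)


def over_white(pixel):
    """Composite a straight-alpha RGBA pixel onto a white background."""
    a = pixel[3] / 255
    return tuple(round(c * a + 255 * (1 - a)) for c in pixel[:3])


DATA = (255, 255, 217, 255)  # '#FFFFD9' from the palette, fully opaque
NAN = (0, 0, 0, 0)           # transparent black, as written for NaN values

if __name__ == "__main__":
    print(over_white(mix_straight(DATA, NAN)))       # (191, 191, 182)
    print(over_white(mix_premultiplied(DATA, NAN)))  # (255, 255, 236)
```

The channel-by-channel average lets the hidden black leak into the colour and yields a grayish pixel, whereas the alpha-weighted average only fades the colour towards white. Replacing the transparent black with an opaque colour removes the leak in either case.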
I tried several ways. The easiest one, which actually worked for me, was to use the tikz terminal with the option externalimages, which saves images created with image as a separate png file.
This png file also contains transparency, and the final result has the same artifacts.
But you can use imagemagick's convert to change the transparent NaN pixels of the png to white with
convert temp.001.png -alpha off -fill white -opaque black temp.001.png
So, a fully working plot file is
# plot.plt
set size ratio -1
set palette defined ( 0 '#D73027', 1 '#F46D43', 2 '#FDAE61', 3 '#FEE090', 4 '#FFFFD9', 5 '#E0F3F8', 6 '#ABD9E9', 7 '#74ADD1', 8 '#4575B4' )
set ytics 100
set yrange reverse
set term tikz standalone externalimages background "white"; set output "temp.tex"
plot "data.txt" u 1:2:3 matrix with image notitle
# temp.001.png is the external image which contains only the 'with image' part
# We must remove the #00000000 color, which represents the NaN pixels
# I couldn't replace the colors directly, but I could first remove the alpha channel
# and then change black to white, because no other black pixel appear
!convert temp.001.png -alpha off -fill white -opaque black temp.001.png
set output #Closes the temporary output file.
!sed -e 's|/Title|%/Title|' -e 's|/Subject|%/Subject|' -e 's|/Creator|%/Creator|' -e 's|/Author|%/Author|' < temp.tex > graph.tex
!pdflatex graph.tex
Mupdf screen shot for graph.pdf:
Note, that I used standalone to be able to directly compile the resulting file, so that I could check the result.
A more cumbersome alternative would be to "manually" plot with image to a png file, and include that in a second plot, like I described in Big data surface plots: Call gnuplot from tikz to generate bitmap and include automatically? Then you can have more influence on how the png is generated.
Just for the record, with image pixels seems to do the "trick" and creates a file without grey surroundings around NaN data points. Tested with gnuplot 5.2.6.
plot FILE u 1:2:3 matrix with image pixels notitle
Code:
### avoid shading around NaN datapoints
reset session
set size ratio -1
FILE = "data.txt"
set palette defined ( 0 '#D73027', 1 '#F46D43', 2 '#FDAE61', 3 '#FEE090', 4 '#FFFFD9', 5 '#E0F3F8', 6 '#ABD9E9', 7 '#74ADD1', 8 '#4575B4' )
set term cairolatex dashed color
set output "temp.tex"
plot FILE u 1:2:3 matrix with image pixels notitle
set output
### end of code
Result: (a PNG of a screenshot, since it looks like I cannot add a PDF here)

Ghostscript converting PostScript to PDF seems to ignore the page size / BoundingBox

I have created a PostScript file from a TIFF image using ImageMagick.
The command-line I am using is:
convert input.tif[0] -density 600 -alpha Off -size 5809x9408 -depth 16 intermediate.ps
This takes my input tiff image (just the main image, and not the thumbnail via using [0]) and creates a .ps file from the bitmap.
When I look at the header of my PostScript file, I can see that it has the correct page size:
%!PS-Adobe-3.0
%%Creator: (ImageMagick)
%%Title: (intermediate.ps)
%%CreationDate: (2017-05-22T08:43:44+10:00)
%%BoundingBox: -0 -0 697 1129
%%HiResBoundingBox: 0 0 697.08 1129
%%DocumentData: Clean7Bit
%%LanguageLevel: 1
%%Orientation: Portrait
%%PageOrder: Ascend
%%Pages: 1
%%EndComments
Yet, when I use Ghostscript to convert this to a PDF, unless I go to a lot of trouble to specify otherwise, gs crops it and puts it on a US Letter sized page.
gs -dPDFA=1 -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sDefaultRGBProfile=AdobeRGB1998.icc -dOverrideICC -sOutputFile=output.pdf -r600 -P PDFA_def.ps intermediate.ps
When I open the resulting PDF, the crop box is 612 x 792 pt, which is US Letter. It should be 697 x 1129 pt, the size of the Bounding Box in the PostScript file.
I have created a custom .joboptions file using Acrobat Distiller that sets image compression and the like, and in this file if I specify the page size at the end, then the resulting PDF comes out the correct size:
<<
/HWResolution [600 600]
/PageSize [697.080 1128.960]
>> setpagedevice
Now this isn't a huge issue for a one-off conversion, but I have to convert a large number of images and I don't want to set the page size manually for every single file.
The lines you quote above are comments, and the comments present suggest that this is an EPS file, not a PostScript program.
The main difference is that EPS is 'encapsulated', which means it's intended to be placed verbatim inside a PostScript program. The enclosing program contains the intelligence regarding the media size, and arranges to set the context such that the EPS is scaled, rotated and translated so that it fits appropriately on the media.
In order to do this successfully, the EPS file must follow certain rules; in particular it must not set any media size itself (because that would mess with the enclosing program).
So it seems likely to me that what you have is an EPS file which does not request any media size at all. So it's hardly surprising that you have to tell Ghostscript what you want to do with it.
Now in order for the enclosing program to place the EPS it needs to know its characteristics, the size and shape of the content. That's what the comments are for. Ordinarily an EPS file is read by an application (eg MS Word, LibreOffice etc) which parses out those comments and uses the information when generating the final PostScript program. The reason an EPS uses comments to store this information is precisely so that it has no effect on the actual content of the EPS and so the entire EPS can be included without further processing by the application.
The short answer is that the Ghostscript documentation describes the EPSCrop and EPSFitPage command line switches, which will do all the work for you.
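For a batch of files, the following Python sketch shows both halves of that idea: reading the bounding box comments the way an enclosing application would, and building a gs command that uses -dEPSCrop so that no page size has to be typed by hand. The helper names are hypothetical, and the gs command is a simplified assumption that leaves out the PDF/A and ICC options from the question.

```python
import re


def parse_bounding_box(ps_text):
    """Return (llx, lly, urx, ury) from the header comments, preferring HiRes."""
    boxes = {}
    for line in ps_text.splitlines():
        match = re.match(r"%%(HiRes)?BoundingBox:\s*(.+)", line)
        if match:
            try:
                values = tuple(float(v) for v in match.group(2).split())
            except ValueError:
                continue  # e.g. a deferred "(atend)" entry
            if len(values) == 4:
                boxes["hires" if match.group(1) else "plain"] = values
        if line.startswith("%%EndComments"):
            break
    return boxes.get("hires") or boxes.get("plain")


def gs_command(input_path, output_path):
    """Build a pdfwrite command that takes the page size from the bounding box."""
    return ["gs", "-dBATCH", "-dNOPAUSE", "-sDEVICE=pdfwrite", "-dEPSCrop",
            f"-sOutputFile={output_path}", input_path]


HEADER = """%!PS-Adobe-3.0
%%BoundingBox: -0 -0 697 1129
%%HiResBoundingBox: 0 0 697.08 1129
%%EndComments
"""

if __name__ == "__main__":
    print(parse_bounding_box(HEADER))
    print(gs_command("intermediate.ps", "output.pdf"))
```

Looping gs_command over the input files and passing each list to subprocess.run would cover the batch case.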

Import raster by raster2pgsql, but get a sql syntax error

I am trying to import a raster file into my Postgres database following this tutorial [http://www.postgis.org/documentation/manual-svn/using_raster.xml.html][1]
Environment: Windows 7, Postgres 8.4, PostGIS 2.0.
My commands are:
cd C:\Program Files (x86)\PostgreSQL\8.4\bin
raster2pgsql -s 4236 -I -G -M kiwi.jpg -F -t 100x100 public.gis > out.sql
psql -U postgres -d mydb2 -f out.sql
The picture named "kiwi" was in the "C:\Program Files (x86)\PostgreSQL\8.4\bin" folder.
out.sql is generated successfully, but running "psql -U postgres -d mydb2 -f out.sql" gives an error:
psql:out.sql:98: ERROR: syntax error at or near "Available"
LINE 1: Available GDAL raster formats:
Thank you!
These are the contents of out.sql (I'm very new to PostGIS, so I can't figure out what's wrong here; since I just followed the tutorial, it should work):
Available GDAL raster formats:
Virtual Raster
GeoTIFF
National Imagery Transmission Format
Raster Product Format TOC format
ECRG TOC format
Erdas Imagine Images (.img)
CEOS SAR Image
CEOS Image
JAXA PALSAR Product Reader (Level 1.1/1.5)
Ground-based SAR Applications Testbed File Format (.gff)
ELAS
Arc/Info Binary Grid
Arc/Info ASCII Grid
GRASS ASCII Grid
SDTS Raster
DTED Elevation Raster
Portable Network Graphics
JPEG JFIF
In Memory Raster
Japanese DEM (.mem)
Graphics Interchange Format (.gif)
Graphics Interchange Format (.gif)
Envisat Image Format
Maptech BSB Nautical Charts
X11 PixMap Format
MS Windows Device Independent Bitmap
SPOT DIMAP
AirSAR Polarimetric Image
RadarSat 2 XML Product
PCIDSK Database File
PCRaster Raster File
ILWIS Raster Map
SGI Image File Format 1.0
SRTMHGT File Format
Leveller heightfield
Terragen heightfield
USGS Astrogeology ISIS cube (Version 3)
USGS Astrogeology ISIS cube (Version 2)
NASA Planetary Data System
EarthWatch .TIL
ERMapper .ers Labelled
NOAA Polar Orbiter Level 1b Data Set
FIT Image
GRIdded Binary (.grb)
Raster Matrix Format
EUMETSAT Archive native (.nat)
Idrisi Raster A.1
Intergraph Raster
Golden Software ASCII Grid (.grd)
Golden Software Binary Grid (.grd)
Golden Software 7 Binary Grid (.grd)
COSAR Annotated Binary Matrix (TerraSAR-X)
TerraSAR-X Product
DRDC COASP SAR Processor Raster
R Object Data Store
Portable Pixmap Format (netpbm)
USGS DOQ (Old Style)
USGS DOQ (New Style)
ENVI .hdr Labelled
ESRI .hdr Labelled
Generic Binary (.hdr Labelled)
PCI .aux Labelled
Vexcel MFF Raster
Vexcel MFF2 (HKV) Raster
Fuji BAS Scanner Image
GSC Geogrid
EOSAT FAST Format
VTP .bt (Binary Terrain) 1.3 Format
Erdas .LAN/.GIS
Convair PolGASP
Image Data and Analysis
NLAPS Data Format
Erdas Imagine Raw
DIPEx
FARSITE v.4 Landscape File (.lcp)
NOAA Vertical Datum .GTX
NADCON .los/.las Datum Grid Shift
NTv2 Datum Grid Shift
ACE2
Snow Data Assimilation System
Swedish Grid RIK (.rik)
USGS Optional ASCII DEM (and CDED)
GeoSoft Grid Exchange Format
Northwood Numeric Grid Format .grd/.tab
Northwood Classified Grid Format .grc/.tab
ARC Digitized Raster Graphics
Standard Raster Product (ASRP/USRP)
Magellan topo (.blx)
SAGA GIS Binary Grid (.sdat)
Kml Super Overlay
ASCII Gridded XYZ
HF2/HFZ heightfield raster
OziExplorer Image File
USGS LULC Composite Theme Grid
Arc/Info Export E00 GRID
ZMap Plus Grid
NOAA NGS Geoid Height Grids
I have found nothing about this error after a lot of searching. I would really appreciate any suggestions.
[1]: http://www.postgis.org/documentation/manual-svn/using_raster.xml.html
The psql utility in PostgreSQL is for processing SQL commands. The file you show doesn't contain SQL commands, it appears to contain information to help someone choose an option for the raster2pgsql program. A quick web search turned up documentation here:
http://www.postgis.org/documentation/manual-svn/using_raster.xml.html
Notice that the -G option is used to "Print the supported raster formats." The command line you used to run the program included that switch. If your goal is to produce SQL statements, that's not an option you should include. I don't know whether any other adjustments need to be made to your command, but you could start by dropping that and see what you get.
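A short Python sketch of that suggestion: it builds the same command without -G and adds a cheap check that the generated file is not just the format listing. Both helper names are hypothetical. Placing all options before the raster file name is an assumption about the usual raster2pgsql usage, and the SRID is copied from the question unchanged.

```python
LISTING_MARKER = "Available GDAL raster formats:"


def raster2pgsql_args(raster, table, srid, tile="100x100"):
    """Argument list for raster2pgsql without -G, options before the file name."""
    return ["raster2pgsql", "-s", str(srid), "-I", "-M", "-F",
            "-t", tile, raster, table]


def looks_like_format_listing(sql_text):
    """True if the output is the -G format listing rather than SQL statements."""
    return sql_text.lstrip().startswith(LISTING_MARKER)


if __name__ == "__main__":
    args = raster2pgsql_args("kiwi.jpg", "public.gis", 4236)
    print(" ".join(args) + " > out.sql")
    print(looks_like_format_listing("Available GDAL raster formats:\nVirtual Raster\n"))
```

Running the check on out.sql before calling psql would catch the situation described in the question early.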

How to get DPI, width and length of an image in PDF in PHP

Suppose a single image is saved as pdf. How can I get DPI, width and length information about the image in PDF file? How can I do it in PHP? Basically I want to retrieve the following information:
On a particular private website I uploaded my PDF and got the following information:
Size of input file: 285.81 KB
Import time: 0 sec
Source document has been created with Adobe Photoshop CS4 Macintosh
PDF file has been produced using Adobe Photoshop for Macintosh -- Image Conversion Plug-in
Creation date of source document is: D:20091002102636+05'30'
Recognized page format at import: Custom (2.997 cm x 5.004 cm)
Document contains 1 page(s).
The following file properties may cause problems:
Page 1 (Image4: Pos. x: 1.499 cm y:2.502 cm, width: 2.997 cm height: 5.004 cm): Resolution of grayscale image too high (found: 300.00 dpi - demanded: 170.00 dpi)
DPI is irrelevant in PDFs themselves. Your concern is with an image embedded within it, and to parse those out easily you will probably want to rebuild with a library that handles the file IO for you, such as http://www.pdflib.com
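Put differently, the resolution in such a report is just the image's pixel count divided by the size at which it is placed on the page. The sketch below works through the numbers from the report above. It is written in Python rather than PHP purely for illustration, the helper names are hypothetical, and the pixel counts are derived from the report, not read from the actual file.

```python
CM_PER_INCH = 2.54


def effective_dpi(pixels, size_cm):
    """Resolution of an image placed at size_cm along one axis."""
    return pixels / (size_cm / CM_PER_INCH)


def pixels_for(size_cm, dpi):
    """Pixel count needed to fill size_cm at the given resolution."""
    return round(size_cm / CM_PER_INCH * dpi)


if __name__ == "__main__":
    # Placed size from the report: 2.997 cm x 5.004 cm at 300 dpi.
    width_px = pixels_for(2.997, 300)
    height_px = pixels_for(5.004, 300)
    print(width_px, height_px)  # 354 591
    print(effective_dpi(width_px, 2.997))
```

Once a library exposes the embedded image's pixel dimensions and its placed size, the same division gives the DPI.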