I have the following command line setup to create a JPEG copy of each page of a PDF, cropping it down to the trim size. What I actually want is the bleed size, which is 3 mm added to each edge of the trim box.
-o newbitmap_%04d.jpg -sDEVICE=jpegcmyk -dJPEGQ=60 -r150 -dSimulateOverprint=false -dUseTrimBox original.pdf
Please could someone suggest how I can do this?
Thanks
David
In the current Ghostscript code, use -dUseBleedBox. See:
http://bugs.ghostscript.com/show_bug.cgi?id=694977
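As a rough sketch of how the original command changes, the helper below builds the argument list with the page box switch swapped. The function name build_jpeg_args and the gs executable name are assumptions for illustration; the other switches are copied from the question. Note that -dUseBleedBox can only help if the PDF actually defines a BleedBox.

```python
def build_jpeg_args(pdf_path, box="Bleed", gs_exe="gs"):
    """Return a Ghostscript argument list that renders each page to CMYK JPEG.

    box is one of "Trim", "Bleed", "Crop" or "Art" and selects the
    matching -dUse<box>Box switch. gs_exe is an assumed executable name.
    """
    if box not in ("Trim", "Bleed", "Crop", "Art"):
        raise ValueError("unknown page box: " + box)
    return [
        gs_exe,
        "-o", "newbitmap_%04d.jpg",
        "-sDEVICE=jpegcmyk",
        "-dJPEGQ=60",
        "-r150",
        "-dSimulateOverprint=false",
        "-dUse" + box + "Box",
        pdf_path,
    ]


args = build_jpeg_args("original.pdf")
print(" ".join(args))
```

The list can be handed to subprocess.run(args, check=True) on a machine where Ghostscript is installed.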
I use the following command to convert a PDF to EPS:
gswin32 -dNOCACHE -dNOPAUSE -dBATCH -dSAFER -sDEVICE=epswrite -dLanguageLevel=2 -sOutputFile=test.eps -f test.pdf
I then use the following command to convert the EPS to another PDF (test2.pdf) to view the EPS figure.
gswin32 -dSAFER -dNOPLATFONTS -dNOPAUSE -dBATCH -dEPSCrop -sDEVICE=pdfwrite -dPDFSETTINGS=/printer -dCompatibilityLevel=1.4 -dMaxSubsetPct=100 -dSubsetFonts=true -dEmbedAllFonts=true -sOutputFile=test2.pdf -f test.eps
I found that the text in the generated test2.pdf has been converted to outline curves. There is no font embedded anymore either.
Is it possible to convert PDF to EPS without converting text to outlines? I mean, to an EPS with embedded fonts and real text.
Also after the conversion (test.pdf -> test.eps -> test2.pdf), the height and width of the PDF figure (test2.pdf) is a little bit smaller than the original PDF (test.pdf):
test.pdf:
test2.pdf:
Is it possible to keep the width and height of the figure after conversion?
Here is the test.pdf: https://dl.dropboxusercontent.com/u/45318932/test.pdf
I tried KenS's suggestion:
gswin32 -dNOPAUSE -dBATCH -dSAFER -sDEVICE=eps2write -dLanguageLevel=2 -sOutputFile=test.eps -f test.pdf
gswin32 -dSAFER -dNOPLATFONTS -dNOPAUSE -dBATCH -dEPSCrop -sDEVICE=pdfwrite -dPDFSETTINGS=/printer -dCompatibilityLevel=1.4 -dMaxSubsetPct=100 -dSubsetFonts=true -dEmbedAllFonts=true -sOutputFile=test2.pdf -f test.eps
I can see the converted test2.pdf has a very weird font:
that is different from the original font in test.pdf:
When I copy the text from test2.pdf, I only get a couple of symbols like:
✕ ✖ ✗✘✙ ✚✛
Here is the test2.pdf: https://dl.dropboxusercontent.com/u/45318932/test2.pdf
I was using the latest Ghostscript 9.15. So what is the problem?
I just noticed you are using epswrite; you don't want to do that. That device is terrible and has been deprecated (and removed now). Use the eps2write device instead (you will need a relatively recent version of Ghostscript).
There's nothing you can do with epswrite except throw it away; it makes terrible EPS files. It also can't make level 2 files, no matter what you set -dLanguageLevel to.
Oh, and don't use -dNOCACHE; that prevents fonts being processed and decomposes everything to outlines or bitmaps.
UPDATE
You set subset fonts to true. By doing so the character codes which are used are more or less random. The first glyph in the document (say for example the 'H' in 'Hello World') gets the code 1, the second one (eg 'e') gets the code 2 and so on.
If you have a ToUnicode CMap, then Acrobat and other readers can convert these character codes to Unicode code points. Without that, the readers have to fall back on heuristics, the final one being 'treat it as ASCII'. Because the encoding arrangement isn't ASCII, you get gibberish. MS Windows' PostScript output can contain additional ToUnicode information, but that's not something we try to mimic in ps2write. After all, presumably you had a PDF file already....
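To make the encoding problem concrete, here is a deliberately simplified model of it. It assumes codes are handed out in order of first use starting at 1, as in the 'Hello World' example above; the function name is made up for illustration and this is not how any real PDF writer is implemented.

```python
def assign_subset_codes(text):
    """Give each distinct glyph a code in order of first use, starting at 1."""
    codes = {}
    for ch in text:
        if ch not in codes:
            codes[ch] = len(codes) + 1
    return codes


text = "Hello World"
codes = assign_subset_codes(text)
encoded = [codes[ch] for ch in text]

# With a ToUnicode-style reverse map the original text is recoverable.
to_unicode = {code: ch for ch, code in codes.items()}
recovered = "".join(to_unicode[c] for c in encoded)

# Without it, a reader that treats the codes as ASCII gets control characters.
as_ascii = "".join(chr(c) for c in encoded)

print(encoded)
print(recovered)
print(repr(as_ascii))
```

The reverse map plays the role of the ToUnicode CMap: once it is lost, the codes alone say nothing about which characters they stood for.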
Every time you do a conversion you run the risk of this kind of degradation, you should really try and minimise this in your workflow.
The problem is even worse in this case, the input PDF file has a TrueType CID Font. Basic language level 2 PostScript can't handle CIDFonts (IIRC this was introduced in version 2015). Since eps2write only emits basic level 2 it cannot write the font as a CIDFont. So instead it captures the glyph outlines and stores them in a type 3 font.
However, our EPS/PS output doesn't attempt to embed ToUnicode information in the PostScript (it's non-standard, very few applications can make use of it, and it therefore makes the files larger for little benefit). In addition, CIDFonts use multiple (2 or more) bytes for the character code, so there's no way to encode the type 3 fonts as ASCII.
Fundamentally you cannot use Ghostscript to go PDF->PS->PDF and still be able to copy/paste/search text, if the input contains CIDFonts.
By the way, there's no point in setting -dLanguageLevel at all. eps2write only creates level 2 output.
I used Inkscape to convert a .pdf to .eps. Just open the .pdf file in Inkscape, choose high mesh in the import options, and save it as an EPS file.
I have a PDF, which I want to convert to PCL
I convert PDF to PCL using the following command:
(gs 8.70)
gswin32c.exe -q -dNOPAUSE -dBATCH \
-sDEVICE=ljetplus -dDuplex=false -dTumble=false \
-sPAPERSIZE=a4 -sOutputFile="d:\doc1.pcl" \
-f"d:\doc1.pdf" -c -quit
When I view or print the output PCL, it is cropped. I would expect the output to start right at the edge of the paper (at least in the viewer).
Is there any way to get the whole output without moving the contents of the page away from the paper edge?
I tried the -dPDFFitPage option, which works, but results in a scaled output.
You are using -sPAPERSIZE=a4. This causes the PCL to render for A4 sized media.
Very likely, your input PDF is made for a non-A4 size. That leaves you with 3 options:
...you either use that exact page size for the PCL too (which your printer possibly cannot handle),
...or you have to add -dPDFFitPage (as you tried, but didn't like),
...or you skip the -sPAPERSIZE=... parameter altogether (which most likely will automatically use the same size as the PDF, and which your printer possibly cannot handle...)
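To see why the fit-page route gives scaled output, here is a small sketch of the arithmetic. It assumes "fit" means one uniform scale factor that keeps the aspect ratio; the page sizes are assumed examples (US Letter onto A4, in points), and this is not Ghostscript's actual code.

```python
def fit_scale(page_w, page_h, media_w, media_h):
    """Uniform scale factor that fits a page inside the media, keeping aspect ratio."""
    return min(media_w / page_w, media_h / page_h)


# Assumed example: a US Letter page (612 x 792 pt) placed on A4 (595 x 842 pt).
scale = fit_scale(612, 792, 595, 842)
print(round(scale, 4))
```

Any factor other than 1 means the page content is resized, which is the effect the question complains about.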
Update 1:
In case ljetplus is not a hard requirement for your requested PCL format variant, you could try this:
gs -sDEVICE=pxlmono -o pxlmono.pcl a4-fo.pdf
gs -sDEVICE=pxlcolor -o pxlcolor.pcl a4-fo.pdf
Update 2:
I can confirm now that even the most recent version of Ghostscript (v9.06) cannot handle non-Letter page sizes for ljetplus output.
I'd regard this as a bug... but it could well be that it won't be fixed, even if reported at the GS bug tracker. However, the least that can be expected is that it will get documented as a known limitation for ljetplus output...
I have page scans of various sizes in JPG format which I convert to a single PDF using ImageMagick. However, I noticed that each type of scan produces a different size PDF page, even if I use the -page A4 option on ImageMagick. I would like every JPG, whatever its size, to "fill" its PDF page, and every PDF page to be the same size. I also have access to tools like pdftk and pdfjam.
Any ideas?
As a hack you can use pdflatex and the wallpaper package. It does the trick and has the advantage over most other methods of not altering the image content (resolution, compression, pixel content) and adds only about 1.2kB of overhead.
To keep the aspect ratio, use:
filename=test.jpg;
echo "\documentclass[a4paper]{article}\
\usepackage{wallpaper}\usepackage{grffile}\
\begin{document}\
\thispagestyle{empty}\
\ThisCenterWallPaper{1}{$filename}~\
\end{document}"\
| pdflatex --jobname "$filename";
rm "$filename".aux "$filename".log
To fill the page completely, use:
filename=test.jpg;
echo "\documentclass[a4paper]{article}\
\usepackage{wallpaper}\usepackage{grffile}\
\begin{document}\
\thispagestyle{empty}\
\ThisTileWallPaper{\paperwidth}{\paperheight}{$filename}~\
\end{document}"\
| pdflatex --jobname "$filename";
rm "$filename".aux "$filename".log
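If the quoting in the echo commands gets in the way (how echo treats backslashes can vary between shells), the same LaTeX source can be generated from a script instead. This is only a sketch: the function name wallpaper_tex and the fill_page flag are made up here, and the LaTeX lines are the ones from the two snippets above.

```python
def wallpaper_tex(image_name, fill_page=False):
    """Return LaTeX source that places image_name on an A4 page via the wallpaper package."""
    if fill_page:
        # Stretch the image to the full paper size (aspect ratio may change).
        placement = r"\ThisTileWallPaper{\paperwidth}{\paperheight}{%s}~" % image_name
    else:
        # Centre the image and keep its aspect ratio.
        placement = r"\ThisCenterWallPaper{1}{%s}~" % image_name
    lines = [
        r"\documentclass[a4paper]{article}",
        r"\usepackage{wallpaper}\usepackage{grffile}",
        r"\begin{document}",
        r"\thispagestyle{empty}",
        placement,
        r"\end{document}",
    ]
    return "\n".join(lines) + "\n"


print(wallpaper_tex("test.jpg"))
```

Write the returned string to a .tex file and run pdflatex on it as before.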
Finally, you can concatenate your pages using pdftk:
pdftk page1.pdf ... page2.pdf cat output final_document.pdf
When I used -density 50% on ImageMagick convert it managed to zoom lower res images to bigger PDF pages.
This should do the trick, assuming your PDF output should have A4 sized pages (portrait):
convert -scale 595x842\! *.jpg output.pdf
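In case the numbers look arbitrary: 595x842 is A4 (210 mm x 297 mm) expressed in points, i.e. at 72 units per inch. A quick check of that arithmetic (the helper name is made up):

```python
def mm_to_points(mm):
    """Convert millimetres to PostScript points (72 per inch, 25.4 mm per inch)."""
    return mm * 72.0 / 25.4


a4_width = round(mm_to_points(210))
a4_height = round(mm_to_points(297))
print(a4_width, a4_height)
```

So the command scales every image to 595x842 pixels, which corresponds to an A4 page only if the image is placed at 72 pixels per inch.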
I am trying to convert a PDF doc whose page size is A4 wide but several cm short of being A4 long into TIFF.
Using GS I can happily convert it to A4 TIFF, but the image is padded at the top with a 3 cm white space.
This leaves a rather ugly white banner at the very top.
Is there any way to get GS to pad at the bottom of a page, not the top?
I am using GS 9.04 on Linux and use the following conversion command.
gs -q -sDEVICE=tiffg4 -dBATCH -dNOPAUSE -dPDFFitPage -sPAPERSIZE=a4 -dFIXEDRESOLUTION -sOutputFile=x.pdf y.pdf
I see the same problem if I do just a pdf to pdf resize conversion
gs -dQUIET -dNOPAUSE -dBATCH -sPAPERSIZE=a4 -sDEVICE=pdfwrite -sOutputFile=x.pdf -dPDFFitPage y.pdf
Many thanks
The 'padding' isn't applied by Ghostscript; it's present in the original PDF file. The white space at the top of the PDF page is rendered as white space in the TIFF file.
You can set the media size to be the size you need (and set -dFIXEDMEDIA, so that it doesn't get changed) and then render the file. The 'white space' will then fall off the top of the media, and won't get rendered. You'll have to figure out what the media height needs to be, of course (it can be given with, for example, -dDEVICEHEIGHTPOINTS).
If you do this, don't set -dPDFFitPage, as this will scale the whole page down to fit the new media size, defeating the purpose of changing the media size.
So, assuming that the PDF file has a MediaBox of A4, but supplies a 'less than A4' CropBox, then you want to set -dUseCropBox, and still not set -dPDFFitPage.
You should also not set -sPAPERSIZE.
If this is not the case then I'm going to have to see an example PDF file.
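For the fixed media size route, a command line could be assembled roughly like this. The numbers are assumptions (A4 width, and a height about 3 cm, roughly 85 pt, short of A4's 842 pt), as are the function name, the gs executable name and the output file name x.tif; adjust them to the real page.

```python
def build_fixed_media_args(src, dst, width_pt, height_pt, gs_exe="gs"):
    """Return Ghostscript arguments that render src to G4 TIFF on a fixed media size.

    Deliberately leaves out -dPDFFitPage and -sPAPERSIZE so the page is
    neither rescaled nor forced back to A4.
    """
    return [
        gs_exe,
        "-q", "-dBATCH", "-dNOPAUSE",
        "-sDEVICE=tiffg4",
        "-dDEVICEWIDTHPOINTS=%d" % width_pt,
        "-dDEVICEHEIGHTPOINTS=%d" % height_pt,
        "-dFIXEDMEDIA",
        "-sOutputFile=" + dst,
        src,
    ]


# Assumed numbers: A4 width, height 85 pt (about 3 cm) less than A4.
args = build_fixed_media_args("y.pdf", "x.tif", 595, 842 - 85)
print(" ".join(args))
```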
I am having some trouble with ImageMagick.
I have installed GhostScript v9.00 and ImageMagick-6.6.7-1-Q16 on Windows 7 - 32Bit
When I run the following command in cmd
convert D:\test\sample.pdf D:\test\pages\page.jpg
only the first page of the PDF is converted to JPG. I have also tried the following command
convert D:\test\sample.pdf D:\test\pages\page-%d.jpg
This creates the first JPG as page-0.jpg, but the others are not created.
I would really appreciate it if someone could shed some light on this. Thanks.
UPDATE:
I have run the command using -debug "All".
One of the many lines of output says:
2011-01-26T22:41:49+01:00 0:00.727 0.109u 6.6.7 Configure Magick[5800]: nt-base.c/NTGhostscriptGetString/1008/Configure
registry: "HKEY_CURRENT_USER\SOFTWARE\GPL Ghostscript\9.00\GS_DLL" (failed)
Could it maybe have something to do with GhostScript after all?
You can specify which page to convert by putting a number in [] after the filename:
convert D:\test\sample.pdf[7] D:\test\pages\page-7.jpg
It should have, however, converted all pages to individual images with your command.
By the way, if you need to convert the first and second pages, provide comma-separated values in the brackets:
convert D:\test\sample.pdf[0,1] D:\test\pages\page.jpg
Resulting JPEG files will be named:
for page 1: page-0.jpg
for page 2: page-1.jpg
You can also do
convert D:\test\sample.pdf[10,15,20-22,50] D:\test\pages\page.jpg
Resulting JPEG files will be named:
for page 11: page-10.jpg
for page 16: page-15.jpg
for page 21: page-20.jpg
for page 22: page-21.jpg
for page 23: page-22.jpg
for page 51: page-50.jpg
Maybe it will help someone.
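To double-check which files a page selection should produce, the bracket syntax can be expanded by hand. This sketch only mirrors the zero-based naming described above; the function name is made up and it does not call ImageMagick.

```python
def expand_page_spec(spec):
    """Expand a spec like "10,15,20-22,50" into a list of zero-based page indices."""
    pages = []
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-")
            pages.extend(range(int(first), int(last) + 1))
        else:
            pages.append(int(part))
    return pages


indices = expand_page_spec("10,15,20-22,50")
for i in indices:
    # Index i is the (i + 1)-th page of the PDF.
    print("page %d -> page-%d.jpg" % (i + 1, i))
```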
According to the site admin at the ImageMagick forum:
ImageMagick uses the pngalpha device when it finds an Adobe
Illustrator PDF. Many of these are a single page. Ideally, Ghostscript
would support a device that allows multiple PDF pages with
transparency but it doesn't...
Easy fix. Edit delegates.xml and change pngalpha to pnmraw.
This worked for me. I don't know if it introduces any other problems however.
See this post from their forums.
I found this solution, which converts all pages in the PDF to a single JPG image:
montage input.pdf -mode Concatenate -tile 1x output.jpg
montage is included in ImageMagick.
Tested on ImageMagick 6.7.7-10 on Ubuntu 13.04.
I ran into a similar problem with Ghostscript. This can be solved by using a %03d iterator in the output file name. Here is an example:
gs -r300 -dNOPAUSE -dBATCH -sDEVICE#pngalpha -sOutputFile=output-%03d.png input.pdf
Here is the reference with detailed information: https://ghostscript.com/doc/current/Devices.htm
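The %03d part is an ordinary printf-style placeholder that gets replaced by the page number, zero-padded to three digits. A small sketch of the resulting names, assuming the numbering starts at 1 (the helper name is made up):

```python
def output_names(template, page_count):
    """Expand a printf-style template for pages 1..page_count."""
    return [template % n for n in range(1, page_count + 1)]


names = output_names("output-%03d.png", 3)
print(names)
```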