Ghostscript - Specify output size when creating png from pdf - pdf

I've read a few threads on this general subject, but they don't seem to help so I'm asking as a new question.
I'm currently creating a process to use Ghostscript from vb.net (I've had some issues with parameters with Ghostscript.NET, but an external process works well enough for me).
I'm capturing a small square area from a specific page of a pdf and converting to a png. There's an offset involved as the area is not on the left/bottom edge, so I'm using 200x200 for the size and -220, 206 for x/y offset and I've specified page 1 for this example, as below:
" -dBATCH -dNOPAUSE -sDEVICE=pnggray -r600 -dDEVICEWIDTHPOINTS=200 -dDEVICEHEIGHTPOINTS=200 -dFIXEDMEDIA -dFirstPage=1 -dLastPage=1 -SOutputFile="mypath\output.png" -c "<</PageOffset [-220 206]>> setpagedevice" -f "mypath\input.pdf"
This works fine, but I next need to restrict the output png file to 600 x 600 pixels, regardless of input size.
All the solutions I've found seem to be related to changing device width/height points, but I use those to specify the area I want to capture, so I'm confused as to how this should work.
Can anyone help?
Sorry I couldn't tag this for Ghostscript - my reputation isn't high enough!
Thanks
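For reference, a raster device's pixel size follows from the media size in points and the resolution: pixels = points x dpi / 72. The sketch below only illustrates that arithmetic for the 200-point capture area described above; the helper names are hypothetical and nothing here calls Ghostscript.

```python
POINTS_PER_INCH = 72  # PostScript/PDF points per inch


def resolution_for_pixels(size_points: float, target_pixels: int) -> float:
    """Return the dpi at which a length of size_points renders to target_pixels."""
    return target_pixels * POINTS_PER_INCH / size_points


def pixels_at(size_points: float, dpi: float) -> float:
    """Return the (unrounded) pixel length of size_points at the given dpi."""
    return size_points * dpi / POINTS_PER_INCH


# Resolution needed to render a 200-point square as 600 pixels.
print(resolution_for_pixels(200, 600))  # 216.0

# Pixel size produced by the -r600 setting in the command above (unrounded).
print(pixels_at(200, 600))
```

Under that relationship, keeping -dDEVICEWIDTHPOINTS=200 and -dDEVICEHEIGHTPOINTS=200 and replacing -r600 with -r216 should give a 600 x 600 pixel PNG, since the capture area is still set by the point dimensions and the page offset.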

Thanks K J for your comments, very helpful. I've been testing an alternative library for decoding the QR codes - ZXing.net - and this seems not to be so fussy about resolution and image size, so I think I've gotten around the problem that way.
I've tried it with both 300 and 600 resolution files of different sizes and it's worked well, so I'd recommend it to anyone with this sort of requirement!

Related

Losing color when rasterizing some vectors

I have a process that rasterizes PDFs with lots of vector objects, in the form of clothing pieces, into high-definition PNGs using Ghostscript in Python. In some cases I'm seeing that some of these pieces lose their color in the output. My belief is that it's related to the fact that there are lots (maybe thousands?) of vectors in each piece.
You can notice that the image on the left (opened in Illustrator) has a piece that's not showing in the rasterized image (right). I may be wrong, and it may not be the quantity of vectors that is causing this, but I don't have another lead at the moment since there is no error message or anything. Why could this be happening?
Ghostscript version: Ghostscript 9.53.3
Command I'm using to run it:
gs -sDEVICE=pngalpha -r300 -o output_file_path -dBufferSpace=2000000000 -dNumRenderingThreads=4 -dMaxBitmap=2147483647 -f input_file_path

How can I render a PDF into BMP fitting content to PDF page boundaries?

I am getting a BMP from a PDF with Ghostscript, but its content is not fitted to the page boundaries. Whatever option I try, I am not able to get the content fitted.
I've tried to generate the BMP with different Ghostscript options, but none of them fits the content exactly.
This is the last command I tried. Please don't expect it to do what I need; it is just copied and pasted from the tty.
gs -dBATCH -dNOPAUSE -sPAPERSIZE=a4 -dFIXEDMEDIA -dPSFitPage -sDEVICE=bmpmono -sOutputFile=Betlem.bmp -g1184x968 -c "<</PageSize [900 500]>> setpagedevice 0 0 translate" -c "<</PageOffset [-23 -100]>> setpagedevice" -f Betlem.pdf
I am expecting the content to fit the BMP image borders exactly, to the pixel. I am using an OpenCV & Python function to extract the content and fit it into a new image, and this is the debug output:
initial BMP image resolution = (872, 900)
BMP image resolution after fit content into new page = (541, 870)
Have a look at the following thread for the fitting function in Python:
I can't find a way to fit contour on new image zero point
You are using PSFitPage for a PDF file, you should be using PDFFitPage or just FitPage.
Note that the 'fitting' in this case is fitting the PDF media size to the existing media. If the PDF content leaves white space around the edge of the media, then the resulting scaling will include that.
In addition you are using PostScript to offset the page origin, which will introduce white space, and you are trying to change the media size, which won't work because you've set -dFIXEDMEDIA. Using these in combination with any of the FitPage switches is not likely to work well.
Randomly stabbing at controls and copying bits of code intended to solve different problems isn't likely to help you I'm afraid.
Without seeing an example file I can't, of course, tell you how to solve your problem, and I'm not really sure exactly what you are trying to achieve. A bitmap with no white space ? A bitmap of a given size with no white space ? Something else ?
[Edit]
OK so looking at the PDF file, the media box is 11.69x8.27 inches, there is white space at the top, bottom, left and right between the marks on the page and the edge of the media.
Running this through Ghostscript, to TIFF at 72 dpi results in a file which Adobe Photoshop says is 11.694x8.264 inches and has white space at top bottom left and right, just like the PDF file.
By default Ghostscript uses the Media size from the PDF to render to, however you can change this. If you were to change the media size to (say) 5.8x4.14 inches, set -dFIXEDMEDIA and then rendered the PDF file what would happen is that the top and right hand side of the PDF file would be 'off the page' so you would only get the left hand portion rendered. Try this:
gs -dDEVICEWIDTHPOINTS=421 -dDEVICEHEIGHTPOINTS=298 -dFIXEDMEDIA "A betlem m en vull anar(1).pdf"
You will see the white space is still present at bottom and left, and the top and right have fallen off the page.
Now, if you add FitPage that will scale the original media down until it fits the new media size (and all the content too, of course). If you try:
gs -dDEVICEWIDTHPOINTS=421 -dDEVICEHEIGHTPOINTS=298 -dFIXEDMEDIA -dFitPage "A betlem m en vull anar(1).pdf"
You'll see that the output has the same physical dimensions as the previous command, but now the whole of the PDF content can be seen because it's been scaled down. You should also see that the distribution of white space has changed, because I didn't strictly divide by 2 in each direction. The FitPage switch scaled the content in both directions by the same amount, and distributed the extra space in the x direction evenly to each side, as new white space.
Now I've no clue what you mean by 'simmetric'. You can undoubtedly do what you want using Ghostscript and the PostScript language, but I don't know what it is you want. Pointing me at Python code isn't going to help I'm afraid, I don't speak Python.
I can say that Ghostscript does not add extra white space that isn't present in the original unless you mess with the rendering by adding parameters like FitPage and FIXEDMEDIA.
If you can explain what you are trying to achieve I can probably tell you what to do.

Ghostscript generate quality thumbnail jpg from pdf

I am generating .jpg thumbnails out of .pdf pages with ghostscript.
This is the code I'm using:
gswin64c -dNumRenderingThreads=4 -dNOPAUSE -sDEVICE=jpeg -g125x175 -dPDFFitPage -sOutputFile=./h%d.jpg -dJPEGQ=100 -r300 -q input.pdf -c quit
Everything is fine except the quality of thumbnails is really bad. I'm hoping for some ghostscript command to increase the quality to imagemagick quality.
Btw. Imagemagick generates good quality thumbnails, but it's too slow.
Here is an example thumbnail with ghostscript:
And here is the image I want. Generated by imagemagick:
It would be helpful to supply the original file, without that its speculation as regards better parameters.
Personally I wouldn't use JPEG, I doubt it offers much compression at such low resolution/media size. It also doesn't perform well on linework and text, which is what your page looks like to me. The combination leads to considerable artefacts in the output.
The ImageMagick output appears to be heavily anti-aliased, you can get that from Ghostscript by setting -dGraphicsAlphaBits, -dTextAlphaBits OR by oversampling the resolution and then downsampling, using -dDownScaleFactor.
Of course, the performance of Ghostscript when producing anti-aliased output will be reduced compared to the normal output. You can't get something for nothing; 'better quality' is going to cost you somewhere along the line.
Note that at the page size you are using -dNumRenderingThreads will have no effect whatsoever. You have to be running a display list for that to have any effect, and such a tiny page will be rendered as a bitmap in memory.
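As a minimal sketch of the anti-aliasing suggestion, the snippet below builds the asker's command with -dGraphicsAlphaBits and -dTextAlphaBits added. The value 4 (the strongest setting; 1, 2 and 4 are the accepted values) is an assumption, as are the hypothetical function name and the decision to drop -dNumRenderingThreads. It only assembles the argument list and does not run Ghostscript.

```python
from typing import List


def thumbnail_args(input_pdf: str, output_pattern: str, alpha_bits: int = 4) -> List[str]:
    """Build a gswin64c argument list for anti-aliased JPEG thumbnails."""
    if alpha_bits not in (1, 2, 4):
        raise ValueError("alpha_bits must be 1, 2 or 4")
    return [
        "gswin64c",
        "-dNOPAUSE",
        "-sDEVICE=jpeg",
        "-g125x175",
        "-dPDFFitPage",
        # Anti-aliasing switches named in the answer; the value is an assumption.
        f"-dGraphicsAlphaBits={alpha_bits}",
        f"-dTextAlphaBits={alpha_bits}",
        f"-sOutputFile={output_pattern}",
        "-dJPEGQ=100",
        "-r300",
        "-q",
        input_pdf,
        "-c",
        "quit",
    ]


# Show the assembled command line for the file names used in the question.
print(" ".join(thumbnail_args("input.pdf", "./h%d.jpg")))
```

The list can be passed to subprocess.run(args, check=True). Passing arguments as a list avoids shell quoting problems with paths that contain spaces.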

Scale to fit and resize page

I am using (PDF)LaTeX to make a document, and I also need to embed already existing PDF documents in it. The problem is that I have PDF documents in several different page sizes (letter, a4, etc) and I want to compile all of them into a single b5 PDF document.
If I use the pdfpages package from CTAN, all hyperlinks from the original PDFs are removed. So I tried to do it with GhostScript.
This sounds like something normal to do but I have failed to find a working solution.
I have, in the meantime, read a few questions and answers, but failed to figure out what I am doing wrong and what I am missing.
This doesn't seem to address my problem of scaling.
Neither does that.
This seems to go in the right direction but I couldn't make use of the information :-(.
To make the problem easier, let's just try to resize a single PDF so that:
its contents are scaled to fit the page
the new page has the size I want
Sounds easy, and it is easy to do, for example with pdfjam:
pdfjam --outfile b5-foo.pdf --paper b5paper foo.pdf
Now the problem with this is that pdfjam throws away hyperlinks. From its website:
A potential drawback of pdfjam and other scripts based upon it is that any hyperlinks in the source PDF are lost.
This must be because it seems to use pdfpages mentioned above.
Unlike pdfjam, GhostScript keeps hyperlinks. However, it either:
crops the original when I downscale; or
does not put the scaled content on a page of the size I need -- instead, I get a page that seems to be scaled down, while keeping the original aspect ratio.
This is what I have installed:
$ gs --version
9.21
(Installed on Linux)
This is how I can use GhostScript to crop the content:
gs -dBATCH -dNOPAUSE \
-sDEVICE=pdfwrite -dFIXEDMEDIA -sPAPERSIZE=isob5 \
-o b5-foo.pdf foo.pdf
... and here is how I can use -dPDFFitPage to scale the content but also keep the aspect ratio of the original page size:
gs -dBATCH -dNOPAUSE \
-sDEVICE=pdfwrite -dFIXEDMEDIA -sPAPERSIZE=isob5 -dPDFFitPage \
-o b5-foo.pdf foo.pdf
To be even clearer: I seem to get a page that is scaled so that it would fit inside the b5 I am asking for, but it is not b5: it still has the H/W ratio the original (letter) had!
I'd be happy if this can be done just using switches but if I need to use PostScript that's perfectly fine.
The solution seems to be to use -dPSFitPage instead of -dPDFFitPage. This might have something to do with the PDF files that I am trying to resize. Unfortunately, I cannot share those :-(. When I tried to reproduce this with files that I generated myself, the problem did not occur. I don't know why this is or how I should have known it.
To summarize, using PDF files for both input and output:
-dFitPage and -dPDFFitPage give me scaled pages with the original aspect ratio
-dPSFitPage gives me scaled content on the page size I request with -sPAPERSIZE="$PAPERSIZE"
This seems to go against what the documentation says.
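The uniform scaling that a fit-page switch is generally expected to perform can be sketched as below. This illustrates the arithmetic only and is not Ghostscript's actual implementation; the function name is hypothetical and the B5 size is rounded to whole points (an assumption).

```python
from typing import Tuple


def fit_page(src_w: float, src_h: float, dst_w: float, dst_h: float) -> Tuple[float, float, float]:
    """Scale a source page uniformly to fit a target page.

    Returns (scale, margin_x, margin_y), where the margins are the white
    space left on each side once the scaled page is centered.
    """
    scale = min(dst_w / src_w, dst_h / src_h)
    margin_x = (dst_w - src_w * scale) / 2
    margin_y = (dst_h - src_h * scale) / 2
    return scale, margin_x, margin_y


# Letter (612 x 792 pt) onto ISO B5 (about 499 x 709 pt, rounded).
scale, margin_x, margin_y = fit_page(612, 792, 499, 709)
print(f"scale={scale:.4f} margin_x={margin_x:.2f} margin_y={margin_y:.2f}")
```

Because letter and B5 have different aspect ratios, a single scale factor cannot fill B5 in both directions: the width is the limiting dimension here, so the scaled content keeps letter's proportions and leaves white space above and below it.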

Resize many PDFs

I have many (around 1000) multiple-page PDFs for a program I am writing.
The problem is that many of them are inconsistent about page size, even within the same document at times. Does anyone know of a way I could programmatically go through the files and resize the pages to what I want? This can be in any language.
I can accomplish this in Adobe Acrobat Pro, but there are so many that it would end up taking a long, long time. The only way I can get it to resize there is to add a background from a file, and then choose the file I want to resize.
Generally, PDFtk is a good fit for this kind of problem. It will let you pull everything apart, and reorder/resize/modify pages on the command line.
I had a similar problem and could easily solve it with PDF split and merge, a Java based toolkit for editing PDF files.
You can resize a PDF with a command line tool, Ghostscript.
Assuming you want to resize a PDF to 306x396 points (which would give you a quarter of a letter-sized page), do it like this:
gs \
-o 306x396-points.pdf \
-sDEVICE=pdfwrite \
-g3060x3960 \
-dPDFFitPage \
-dUseCropBox \
input.pdf
Note that the -g... dimensions are in pixels. Because the pdfwrite device uses a default resolution of 720 dpi, they are 10 times the sizes in points (720 / 72 = 10).
For Windows, use gswin32c.exe or gswin64c.exe instead of gs.
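Since the question is about roughly 1000 files, the points-to-pixels conversion and the command above can be wrapped in a small helper. This is a sketch under stated assumptions: the function names are hypothetical, the 720 dpi default is the pdfwrite figure quoted above, and the snippet only builds the argument list without running Ghostscript.

```python
from typing import List


def pdfwrite_geometry(width_pt: float, height_pt: float, dpi: int = 720) -> str:
    """Convert a page size in points to a -g switch in device pixels."""
    width_px = round(width_pt * dpi / 72)
    height_px = round(height_pt * dpi / 72)
    return f"-g{width_px}x{height_px}"


def resize_args(src: str, dst: str, width_pt: float, height_pt: float) -> List[str]:
    """Build the gs argument list from the answer for one input file."""
    return [
        "gs",
        "-o", dst,
        "-sDEVICE=pdfwrite",
        pdfwrite_geometry(width_pt, height_pt),
        "-dPDFFitPage",
        "-dUseCropBox",
        src,
    ]


print(pdfwrite_geometry(306, 396))  # -g3060x3960
print(resize_args("input.pdf", "306x396-points.pdf", 306, 396))
```

To process a directory, one could loop over pathlib.Path(folder).glob("*.pdf") and pass each list to subprocess.run(args, check=True), writing the results to a separate output folder so the originals are not overwritten.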