Why is the size of the file with the cropped image the same as that of the initial one? - pdf

I have scanned my copybook and want to crop out extra white regions with Inkscape.
To achieve this, I import the initial image (PDF) into Inkscape, draw an appropriate rectangle, and use Object->Clip->Set to cut out the needed region. Then I resize the page to the drawing and save the resulting page as a new PDF file through File->Save a Copy.
I expected the new PDF file (with the cropped image) to be smaller than the initial PDF (with the uncropped image), but they are the same size.
What is the reason for this, and can it be worked around?
I use Inkscape 0.91 on Linux Mint 18.2.
Thank you in advance.

Because the original image is still there, fully intact and with all its contents. The cropping rectangle is just an instruction to the PDF viewer to hide the regions outside it when rendering the image.
However, in Inkscape you can bake the crop rectangles in and, when exporting to PDF, "apply raster effects", which should actually alter the contained image(s).
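The difference between clipping and baking a crop can be illustrated with a toy model. This is only a sketch: the list-of-rows "image" and the helper names (`clip_only`, `bake_crop`, `stored_pixels`) are made up for illustration and do not reflect how Inkscape or PDF actually store data.

```python
# Toy model (not Inkscape's or PDF's real data layout): an "image" is a
# list of pixel rows, and a clip is just a rectangle stored next to it.

def make_image(width, height):
    """Build a width x height image; each pixel is one integer."""
    return [[(x + y) % 256 for x in range(width)] for y in range(height)]

def stored_pixels(image):
    """Count the pixels that would have to be written to the file."""
    return sum(len(row) for row in image)

def clip_only(image, rect):
    """Clipping keeps the full image and merely records the rectangle."""
    return {"image": image, "clip": rect}

def bake_crop(image, rect):
    """Baking the crop discards every pixel outside the rectangle."""
    x0, y0, x1, y1 = rect
    return {"image": [row[x0:x1] for row in image[y0:y1]], "clip": None}

page = make_image(200, 300)
rect = (50, 50, 150, 250)  # left, top, right, bottom (right/bottom exclusive)

clipped = clip_only(page, rect)
baked = bake_crop(page, rect)

print(stored_pixels(clipped["image"]))  # all 200 x 300 pixels are still stored
print(stored_pixels(baked["image"]))    # only the 100 x 200 visible region
```

In this model the clipped document carries every original pixel plus four numbers, which is why the file size barely changes until the crop is baked into the image data.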

Related

Zooming a picture vs zooming a pdf

I'm rendering a PDF using the pdf.js library. There I can specify a zoom (scale) property, which is fine. I can set a pretty high zoom, let's say 8x, and still get decent quality in the rendered PDF. However, if I convert the same PDF to a raster image format like JPEG and then render it at a high zoom, the quality is very bad. Why is that?
You are describing the difference between vector graphics and raster graphics. A vector graphic format contains commands telling how to draw an image. A raster format is an array that tells what the color is at each position in the image.
PDF is largely a vector format (yes, you can embed a raster image in a PDF). A PDF that has an instruction to draw a line or draw a character can be zoomed to any degree and the drawing will be correct.
In a raster format, if you zoom, eventually you see the individual pixels in the array and they cannot be zoomed any more without distortion. Text in a JPEG or PNG file becomes jagged as you zoom.
On the other hand, try to create a photographic quality image just with drawing commands and you would get huge files.
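As a rough illustration of the point about zooming, the sketch below compares re-running a drawing command at a higher resolution with enlarging an already rasterized result. The shape (the region where x <= y), the nearest-neighbour zoom, and all function names are assumptions chosen for the example; they are not how pdf.js or a JPEG viewer is implemented.

```python
def render_vector(size, scale):
    """Rasterize the command 'fill the region where x <= y' directly at
    the target resolution (size * scale pixels per side)."""
    n = size * scale
    return [[1 if x <= y else 0 for x in range(n)] for y in range(n)]

def zoom_raster(pixels, scale):
    """Nearest-neighbour zoom: each source pixel becomes a scale x scale block."""
    height, width = len(pixels), len(pixels[0])
    return [[pixels[y // scale][x // scale] for x in range(width * scale)]
            for y in range(height * scale)]

def longest_step(image):
    """Longest run of consecutive rows with the same filled width,
    i.e. the height of the tallest 'stair step' along the diagonal edge."""
    widths = [sum(row) for row in image]
    best = run = 1
    for prev, cur in zip(widths, widths[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best

small = render_vector(4, 1)          # the 4 x 4 "JPEG" of the drawing
vector_8x = render_vector(4, 8)      # drawing command re-run at 8x
raster_8x = zoom_raster(small, 8)    # the 4 x 4 raster blown up 8x

print(longest_step(vector_8x))  # edge advances on every row
print(longest_step(raster_8x))  # edge advances only every 8 rows
```

Both results are 32 x 32 pixels, but the zoomed raster shows steps eight rows tall because no information beyond the original 4 x 4 grid exists.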

Creating PDF from a single JPG file using Ghostscript - image placement issue inside PDF

I'm trying to output a PDF file from a JPG file using Ghostscript. The following command works fine:
gs -sDEVICE=pdfwrite -sPAPERSIZE=a4 -o /pdf_from_image.pdf /path/to/viewjpeg.ps -c \(/source_image.jpg\) viewJPEG
Based on existing threads and Ghostscript documentation I'm using -sPAPERSIZE=a4 to generate the output in A4 format. The PDF generates fine, but the PROBLEM is when the image dimensions don't match that of A4, GS puts the image at the bottom of the page with best "width" fit. I think it actually tries to put it in the lower left bottom. To add to it, at times the image is auto rotated.
My question is:
1) Is there any option to put the image on top left corner of the page.
2) Stop GS auto rotating the image.
Any help to put me in the right direction would be greatly appreciated. Thanks.
PDF and PostScript use a coordinate system with the origin (0,0) in the lower left corner, so Ghostscript is actually doing the 'correct' thing: putting the image at the origin. To place the image at the top, you'd have to subtract the image height from the page height and translate the image upwards by that amount.
As for why some images are being rotated, I can't say for sure. Some JPGs contain metadata that indicates the intended orientation of the image; however, not all software gets the value right. I don't know if Ghostscript respects that metadata, but you could check whether your 'bad' images have the correct orientation tag (you can use an Exif viewer or similar to inspect them).
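The translation described above can be worked out as follows. This is a minimal sketch that assumes the image is scaled to fit the page width (as the question reports) and uses the rounded A4 size of 595 x 842 points; the function name `top_left_offset` is hypothetical, and how the resulting values are passed to viewjpeg.ps is not covered here.

```python
A4_WIDTH_PT, A4_HEIGHT_PT = 595, 842  # A4 in points (1/72 inch), rounded

def top_left_offset(image_w, image_h, page_w=A4_WIDTH_PT, page_h=A4_HEIGHT_PT):
    """Return (scale, tx, ty) that fits the image to the page width and
    shifts it so its top edge touches the top of the page.

    The origin (0, 0) is the lower-left corner, as in PDF and PostScript,
    so moving the image to the top means translating it upwards.
    """
    scale = page_w / image_w          # best "width" fit
    scaled_h = image_h * scale        # image height on the page, in points
    tx = 0.0                          # already flush with the left edge
    ty = page_h - scaled_h            # page height minus image height
    return scale, tx, ty

# Example: an image 1190 units wide and 842 units tall.
scale, tx, ty = top_left_offset(1190, 842)
print(scale, tx, ty)
```

A negative `ty` would mean the width-fitted image is taller than the page, in which case a height fit would be needed instead.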

Combining two resized (matched sizes) images together with ImageMagick

I have two images I want to combine with ImageMagick. The way I have it set up right now, a form uploads an image to a server, the image is resized to make a smaller version, and both are saved to a folder. My code for this uses ImageMagick and works just fine.
What I'm looking to accomplish is sort of like annotating it, however I'd like to do this by appending a header image to the top of the file. I know I can accomplish this using the -append flag. What I'm confused on is how I can match the image sizes so I don't need to do several resizes.
I'm making the resizing occur only if the uploaded files exceed 1000x1000 (using the -resize 1000x1000> argument), but this doesn't guarantee that all files will be 1000px wide. I've made the header image 1000 pixels wide, and when the image is at 1000px, appending those is no problem.
My problem is working out how it should be handled when the image is smaller than that. Do I resize the header image to the other image's size, then append? I know it'd be better to avoid scaling the image up to 1000px, appending the header, and scaling back down, as that would affect image quality. Can I resize the header image without writing it to a temporary file? That is, chain the operations and end up writing only a single completed file to disk?
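The sizes involved can be worked out ahead of time. The sketch below only does the arithmetic: it mimics the shrink-only behaviour of the `1000x1000>` geometry and then computes the header size that would match the resulting width. The function names and the 120-pixel header height are hypothetical, and ImageMagick's exact rounding may differ from the simple rounding used here.

```python
def shrink_to_fit(width, height, box=1000):
    """Approximate '-resize 1000x1000>': shrink only when the image
    exceeds the box in either dimension, preserving the aspect ratio."""
    if width <= box and height <= box:
        return width, height  # smaller images are left untouched
    factor = min(box / width, box / height)
    return (max(1, int(width * factor + 0.5)),
            max(1, int(height * factor + 0.5)))

def header_size_for(image_width, header_w, header_h):
    """Scale the header to the image's width, preserving its aspect ratio."""
    new_h = max(1, int(header_h * image_width / header_w + 0.5))
    return image_width, new_h

HEADER = (1000, 120)  # hypothetical header dimensions in pixels

for upload in [(2000, 1500), (800, 600)]:
    w, h = shrink_to_fit(*upload)
    hw, hh = header_size_for(w, *HEADER)
    # Appending vertically stacks the header on top of the image.
    print(upload, "->", (w, h), "header", (hw, hh), "total", (w, h + hh))
```

Only the header is ever scaled down to the image's width, so the uploaded image itself is resized at most once.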

Discrepancy between PDF cropbox and SVG created out of a PDF page

I am trying to extract the background image of a PDF page to an SVG (using the xpdf library). The problem I am facing is that the PDF contains additional images/graphics (presumably outside the cropbox) that are not rendered by PDF readers, but the corresponding SVG contains these images/graphics. I tried setting the viewBox attribute of the SVG to correspond to the cropBox bounds of that PDF page, but the resulting SVG still displays some of the graphics objects that are not rendered by PDF readers. I also tried adding a clip path to the SVG, a rectangular clipping region with bounds corresponding to the PDF cropbox, but this too did not eliminate some of the additional graphics elements not seen in the PDF. Any idea what the problem could be? What is the right way to carry over the PDF cropbox to SVG? By the way, the SVGs generated in both cases mentioned above (viewBox and clipping region approaches) were fairly close in dimensions to the viewable area of the PDF page, and the additional elements were seen only close to the edges. Is it that cropbox dimensions obtained from the PDF should not be used directly in SVG?
It turns out that the problem was due to my code not transforming the PDF cropbox attribute (as given by xpdf) to user coordinates using the CTM (current transformation matrix), which is also obtainable through xpdf. After applying the transformation, the resulting SVG matches the rendered portion of the PDF page.
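The kind of transformation described in this answer can be sketched as follows. It uses the PDF matrix convention [a b c d e f], where x' = a*x + c*y + e and y' = b*x + d*y + f. The function names, the crop box values, and the y-flipping matrix are all hypothetical; the actual matrix and its direction come from xpdf and are not shown in the original post.

```python
def apply_ctm(ctm, x, y):
    """Apply a PDF-style matrix [a b c d e f] to the point (x, y)."""
    a, b, c, d, e, f = ctm
    return a * x + c * y + e, b * x + d * y + f

def transform_box(ctm, box):
    """Transform an axis-aligned box (x0, y0, x1, y1) and return the
    axis-aligned bounding box of its four transformed corners."""
    x0, y0, x1, y1 = box
    corners = [apply_ctm(ctm, x, y) for x in (x0, x1) for y in (y0, y1)]
    xs = [p[0] for p in corners]
    ys = [p[1] for p in corners]
    return min(xs), min(ys), max(xs), max(ys)

# Hypothetical values: a crop box in points, and a matrix that flips the
# y axis (SVG's y grows downward) for a page 792 points tall.
crop_box = (36, 72, 576, 700)
flip_y = (1, 0, 0, -1, 0, 792)

print(transform_box(flip_y, crop_box))
```

Transforming all four corners and taking their bounding box keeps the result correct even when the matrix includes a rotation, where simply transforming two opposite corners would not be enough.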

Getting the cropping and rotation information of an image in a PDF

I have a PDF with a page that contains an image. I'm using a command line tool to extract this image. The page in the PDF shows only a part of the image: the extracted image has a lot more "contents", and they are slightly rotated. This happens, I assume, because some sort of cropping and/or rotation was applied to the image when the PDF was built.
Is there any way, using iText, to figure out the offset and rotation applied to the image? That would allow me to crop the extracted image in the same way and end up with something similar to what's visible on the PDF page.