Why is the Apache PDFBox ImageUtil.getRotatedImage method increasing the size of a buffered image? - pdfbox

I am using the getRotatedImage function of the ImageUtil class in the Apache PDFBox library to rotate a BufferedImage. Given a BufferedImage and a rotation angle in degrees, it returns a correctly rotated image. The only problem is that the output image, when converted to PDF, is more than double the size of the original: a 1.7 MB image becomes a 4.3 MB PDF.
Is there a problem with the ImageUtil.getRotatedImage method of the PDFBox library?
The image format was .tif, and I used LosslessFactory.createFromImage(doc, bim) to create the PDImageXObject object.

Related

Why is the size of the file with the cropped image the same as that of the initial one?

I have scanned my copybook and want to crop out the extra white regions with Inkscape.
To do this, I import the initial image (PDF) into Inkscape, draw an appropriate rectangle, and use Object->Clip->Set to cut out the region I need. Then I resize the page to the drawing and save the result as a new PDF file through File->Save a Copy.
I expected the new PDF file (with the cropped image) to be smaller than the initial PDF (with the uncropped image), but they are the same size.
What is the reason for this, and can it be worked around?
I use Inkscape 0.91 on Linux Mint 18.2.
Thank you in advance.
Because the original image is still there, fully intact and with all its contents. The clipping rectangle is just an instruction to the PDF viewer to hide those regions when rendering the image.
However, in Inkscape you can bake in the crop rectangles and, when exporting to PDF, "apply raster effects", which should actually alter the contained image(s).
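As a rough analogy only (this is plain Java, not Inkscape or PDF code), the difference between a clip and a baked crop resembles the difference between a sub-image view and a pixel copy of a BufferedImage. The class and method names below are made up for illustration.

```java
import java.awt.image.BufferedImage;

class CropDemo {
    // A "clip": getSubimage returns a view that shares the original pixel storage.
    static BufferedImage clipView(BufferedImage src, int x, int y, int w, int h) {
        return src.getSubimage(x, y, w, h);
    }

    // A "baked" crop: copy only the visible pixels into fresh, smaller storage.
    static BufferedImage bakedCrop(BufferedImage src, int x, int y, int w, int h) {
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        for (int row = 0; row < h; row++) {
            for (int col = 0; col < w; col++) {
                out.setRGB(col, row, src.getRGB(x + col, y + row));
            }
        }
        return out;
    }

    // Number of pixel elements held by the storage behind an image.
    static int storageSize(BufferedImage img) {
        return img.getRaster().getDataBuffer().getSize();
    }

    public static void main(String[] args) {
        BufferedImage page = new BufferedImage(100, 100, BufferedImage.TYPE_INT_RGB);
        System.out.println(storageSize(clipView(page, 5, 5, 20, 10)));  // still the full 100x100 buffer
        System.out.println(storageSize(bakedCrop(page, 5, 5, 20, 10))); // only the 20x10 crop
    }
}
```

The view still carries every pixel of the original, which is the same reason the clipped PDF does not shrink; only the copy actually discards data.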

Zooming a picture vs zooming a pdf

I'm rendering a PDF using the pdf.js library. There I can specify the zoom (scale) property, which works fine. I can set a pretty high zoom, let's say 8x, and still get decent quality in the rendered PDF. However, if I convert the same PDF to a graphic image format like JPEG and then try to render it at high zoom, the quality is very bad. Why is that so?
You are describing the difference between vector graphics and raster graphics. A vector graphic format contains commands telling how to draw an image. A raster format is an array that tells what the color is at each position in the image.
PDF is largely a vector format (yes, you can also embed a raster image in a PDF). A PDF that has an instruction to draw a line or draw a character can be zoomed to any degree and the drawing will be correct.
In a raster format, if you zoom, eventually you see the individual pixels in the array, and they cannot be zoomed any further without distortion. Text in a JPEG or PNG file becomes jagged as you zoom.
On the other hand, try to create a photographic quality image just with drawing commands and you would get huge files.
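The difference can be sketched in a few lines of Java. This is a toy illustration, not how pdf.js works: the "vector" source is just the instruction "draw the diagonal of the page", rasterized fresh at the requested size, while the "raster" source is a small bitmap of that same diagonal blown up pixel by pixel. All names here are made up for the example.

```java
import java.awt.image.BufferedImage;

class ZoomDemo {
    static final int WHITE = 0xFFFFFF;
    static final int BLACK = 0x000000;

    // "Vector" rendering: execute the drawing instruction at the target resolution.
    static BufferedImage renderDiagonal(int size) {
        BufferedImage img = new BufferedImage(size, size, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < size; y++) {
            for (int x = 0; x < size; x++) {
                img.setRGB(x, y, x == y ? BLACK : WHITE);
            }
        }
        return img;
    }

    // "Raster" zoom: every source pixel becomes a factor-by-factor block.
    static BufferedImage zoomRaster(BufferedImage src, int factor) {
        BufferedImage out = new BufferedImage(src.getWidth() * factor,
                src.getHeight() * factor, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < out.getHeight(); y++) {
            for (int x = 0; x < out.getWidth(); x++) {
                out.setRGB(x, y, src.getRGB(x / factor, y / factor));
            }
        }
        return out;
    }

    // Longest horizontal run of black pixels: a crude measure of how coarse the staircase is.
    static int longestBlackRun(BufferedImage img) {
        int best = 0;
        for (int y = 0; y < img.getHeight(); y++) {
            int run = 0;
            for (int x = 0; x < img.getWidth(); x++) {
                run = (img.getRGB(x, y) & 0xFFFFFF) == BLACK ? run + 1 : 0;
                best = Math.max(best, run);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        BufferedImage small = renderDiagonal(4);
        System.out.println(longestBlackRun(zoomRaster(small, 8))); // 8: blocky steps
        System.out.println(longestBlackRun(renderDiagonal(32)));   // 1: re-drawn at full detail
    }
}
```

Both results are 32x32 pixels, but the zoomed bitmap only has the four original samples to work with, so its steps are eight pixels wide.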

PNG image format is showing small in PDF output

I am generating PDF output from a ditamap file. I am trying to use a PNG icon instead of a GIF for the Note icon. After the transformation, the PNG icon appears too small in the output. SVG and GIF icons work fine. I am using DITA-OT for the transformation.
Please suggest a solution.
A small PNG icon appears beside the Note text. My PNG icon is a normal size, i.e. 24x24 px.

Discrepancy between PDF cropbox and SVG created out of a PDF page

I am trying to extract the background image of a PDF page to an SVG (using the xpdf library). The problem I am facing is that the PDF contains additional images/graphics (presumably outside the cropbox) that are not rendered by PDF readers, but the corresponding SVG contains these images/graphics. I tried setting the viewBox attribute of the SVG to correspond to the cropBox bounds of that PDF page, but the resulting SVG still displays some of the graphics objects that are not rendered in the PDF. I also tried adding a clip path to the SVG, a rectangular clipping region with bounds corresponding to the PDF cropbox, but this too did not eliminate some of the additional graphics elements not seen in the PDF. Any idea what the problem could be? What is the right way to carry over the PDF cropbox to SVG? By the way, the SVGs generated in both cases mentioned above (viewBox and clipping region approaches) were fairly close in dimensions to the viewable area of the PDF page, and the additional elements were seen only close to the edges. Is it that cropbox dimensions obtained from the PDF should not be used directly in SVG?
It turns out that the problem was due to my code not transforming the PDF cropbox attribute (as given by xpdf) to user coordinates using the CTM (current transformation matrix, also obtainable through xpdf). After applying the transformation, the resulting SVG matches the rendered portion of the PDF page.
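A minimal sketch of that kind of transformation, written in plain Java rather than against the xpdf API: it assumes the crop box and a six-element matrix [a b c d e f] have already been obtained somehow, and the numbers used are invented for illustration.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Rectangle2D;

class CropBoxToSvg {
    // Map a rectangle through a PDF-style matrix [a b c d e f]
    // (x' = a*x + c*y + e, y' = b*x + d*y + f) and return the
    // axis-aligned bounding box of the transformed corners.
    static Rectangle2D transformBox(Rectangle2D box, double[] ctm) {
        AffineTransform at = new AffineTransform(
                ctm[0], ctm[1], ctm[2], ctm[3], ctm[4], ctm[5]);
        return at.createTransformedShape(box).getBounds2D();
    }

    public static void main(String[] args) {
        // Hypothetical crop box: x=10, y=20, width=100, height=200.
        Rectangle2D cropBox = new Rectangle2D.Double(10, 20, 100, 200);
        // Hypothetical matrix: scale by 2 and flip the y axis.
        double[] ctm = {2, 0, 0, -2, 0, 1584};
        Rectangle2D r = transformBox(cropBox, ctm);
        // x=20, y=1144, width=200, height=400
        System.out.println(r.getX() + " " + r.getY() + " "
                + r.getWidth() + " " + r.getHeight());
    }
}
```

The resulting rectangle, rather than the raw crop box, is what would be used for the SVG viewBox or clip path. If the matrix includes a rotation, the bounding box is larger than the rotated crop area, so a clip path built from the four transformed corners would be the tighter choice.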

Getting the cropping and rotation information of an image in a PDF

I have a PDF with a page that contains an image. I'm using a command line tool to extract this image. The page in the PDF shows only part of the image: the extracted image has a lot more "contents", and they are slightly rotated. This happens, I assume, because some sort of cropping and/or rotation was applied to the image when the PDF was built.
Is there any way, using iText, to figure out the offset and rotation applied to the image? That would allow me to crop the extracted image in the same way and end up with something similar to what's visible on the PDF page.