I have a PDF document that I build from NSImages whose size is given in points (72 per inch); each image has a single representation, which is measured in pixels. I put these images into PDFPages with initWithImage and then save the document.
When I open the document, I need the resolution of the original image. However, all of the rectangles that PDFPage gives me are measured in points, not pixels.
I know that the information is in there, and I suppose I can try to parse the PDF data myself, by going through the voyeur.app example... but that's a WHOLE lot of effort to do something that should be pretty normal...
Is there an easier way to do this?
Added:
I've tried two techniques:
get the PDFRepresentation data from the page, and use it to make a new NSImage via initWithData. This works, however, the image has both size and pixel size in 72dpi.
Draw the PDFPage into a new off-screen context, and then get a CGImage from that. The problem is that when I'm making the context, it appears that I need to know the size in pixels already, which defeats part of the purpose...
There are a few things you need to understand about PDF:
The PDF Coordinate system is in points (1/72 inch) by default.
The PDF Coordinate system is devoid of resolution. (this is a white lie - the resolution is effectively the limits of 32 bit floating point numbers).
Images in PDF do not inherently have any resolution attached to them (this is a white lie - images compressed with JPEG2000 still have resolution in their embedded metadata).
An Image in PDF is represented by an object that contains a series of samples that are stored using some compression filter.
Image objects can be rendered on a page multiple times at any size.
Since resolution is defined as the number of pixels (or samples) per unit distance, resolution only means something for a particular rendering of an image on a page. So if you are rendering a particular image to fill the page, then the resolution in dpi is
xdpi = image_width / (pageWidthInPoints / 72.0);
ydpi = image_height / (pageHeightInPoints / 72.0);
If the image is not being rendered to the full size of the page, a complete solution is very tricky. Adobe prescribes that images be treated as 1x1 and that you change the transformation matrix to determine how to render them. This means that you need the matrix in effect at the point where the image is rendered, and you need to push the points (0,0), (0,1), (1,0) through that matrix. The Euclidean distance between (0,0)' and (1,0)' will give you the width in points, and the Euclidean distance between (0,0)' and (0,1)' will give you the height in points.
So how do you get that matrix? Well, you need the content stream for the page and you need to write a PDF interpreter that can rip the content stream and keep track of changes to the CTM. When you reach your image, you extract the CTM for it.
That last step should take about an hour with a decent PDF toolkit, provided you are familiar with the toolkit. Writing that toolkit is several person-years of work.
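Once the CTM is known, the corner-mapping step described above is small. Here is a minimal Python sketch, assuming the matrix is given as the six numbers [a, b, c, d, e, f] used by PDF and that the image's pixel dimensions are already known; the function names `apply_ctm` and `image_dpi` are made up for illustration and do not come from any toolkit.

```python
import math

def apply_ctm(ctm, x, y):
    """Map a point through a PDF matrix given as [a, b, c, d, e, f]."""
    a, b, c, d, e, f = ctm
    return (a * x + c * y + e, b * x + d * y + f)

def image_dpi(ctm, pixel_width, pixel_height):
    """Effective (xdpi, ydpi) of an image drawn as a unit square under ctm."""
    ox, oy = apply_ctm(ctm, 0, 0)   # (0,0)'
    rx, ry = apply_ctm(ctm, 1, 0)   # (1,0)'
    tx, ty = apply_ctm(ctm, 0, 1)   # (0,1)'
    width_pts = math.hypot(rx - ox, ry - oy)
    height_pts = math.hypot(tx - ox, ty - oy)
    return (pixel_width / (width_pts / 72.0),
            pixel_height / (height_pts / 72.0))

# Hypothetical example: a 600x300 pixel image drawn 144pt wide and 72pt high.
print(image_dpi([144, 0, 0, 72, 10, 20], 600, 300))
```

Because the distances are Euclidean, the same function also works when the image is rotated on the page.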
I'm rendering a PDF using the pdf.js library. There I can specify a zoom (scale) property, which works fine: I can set a pretty high zoom, say 8x, and still get decent quality from the rendered PDF. However, if I convert the same PDF to a graphic image format like JPEG and then render that at high zoom, the quality is very bad. Why is that?
You are describing the difference between vector graphics and raster graphics. A vector graphic format contains commands telling how to draw an image. A raster format is an array that tells what the color is at each position in the image.
PDF is largely a vector format (yes, you can embed a raster image in a PDF). A PDF that has an instruction to draw a line or draw a character can be zoomed to any degree and the drawing will be correct.
In a raster format, if you zoom, eventually you see the individual pixels in the array and they cannot be zoomed any more without distortion. Text in a JPEG or PNG file becomes jagged as you zoom.
On the other hand, try to create a photographic quality image just with drawing commands and you would get huge files.
I am trying to overlay a part of one image on top of another image on .NET Core (the code needs to be cross-platform).
I considered using ImageSharp since it supports Windows, macOS and Linux.
But I couldn't find pixel blending in its feature list, although I saw that you can access an individual pixel.
So the use case is: I have two 4K PNG images, and I want a small part of the first image (a square covering roughly 10% of the overall image) to be overlaid on top of the second image (not the whole image, just the same 10% region), and to get the area where the merging happened as a new JPEG image.
(The source PNGs have some degree of transparency.)
I considered cropping out the two parts I want to merge from the two 4K images and then blending them to get the final image, but that is too slow for the needs of the project I'm working on.
ImageSharp does support pixel blending: you can specify the pixel blending mode during Draw/Fill operations by passing in a GraphicsOptions parameter and setting its BlenderMode and BlendPercentage (defaults to 100%) properties.
Currently ImageSharp has implementations for the following blending modes:
Normal
Multiply
Add
Substract
Screen
Darken
Lighten
Overlay
HardLight
Src
Atop
Over
In
Out
Dest
DestAtop
DestOver
DestIn
DestOut
Clear
Xor
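For readers unfamiliar with what such a blend computes, here is a rough per-pixel sketch of a "Normal" (source-over) blend. It is written in Python rather than C# and is not ImageSharp's actual implementation; the function name `blend_normal`, the straight-alpha RGBA tuples with 0..1 float channels, and the `blend_percentage` argument (modelled on the BlendPercentage option above) are assumptions for illustration only.

```python
def blend_normal(src, dst, blend_percentage=1.0):
    """Composite one straight-alpha RGBA pixel (src) over another (dst)."""
    sr, sg, sb, sa = src
    dr, dg, db, da = dst
    sa = sa * blend_percentage          # scale the source's opacity
    out_a = sa + da * (1.0 - sa)        # resulting coverage
    if out_a == 0.0:
        return (0.0, 0.0, 0.0, 0.0)     # both pixels fully transparent

    def channel(s, d):
        # Weighted mix of source and destination, un-premultiplied again.
        return (s * sa + d * da * (1.0 - sa)) / out_a

    return (channel(sr, dr), channel(sg, dg), channel(sb, db), out_a)

# Half-transparent red over opaque blue gives an even mix of the two.
print(blend_normal((1.0, 0.0, 0.0, 0.5), (0.0, 0.0, 1.0, 1.0)))
```

Applying this only to the pixels inside the region of interest is what avoids processing the full 4K images.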
I'm importing my stimuli from a folder. I would like to make them bigger (the actual image size is 120 pix (height) x 170 pix (width)). I've tried to double the size by using this code in the PsychoPy Coder:
stimuli.append(visual.ImageStim(win=win, name='image', units='cm', size= [9, 6.3],
(I used double the size in cm), but this distorts the image. Is there any way to enlarge it without distorting it, or do I have to change the stimuli themselves?
Thank you
Just to answer what Michael said in the comment: no, if you scale an image up, the only way of guessing what is in between pixels is interpolation. This is what psychopy does and what ANY software would do. To make an analogy: take a picture of a distant tree using your digital camera. Then scale the image up using all kinds of software. You won't suddenly be able to see the individual leaves since the software had no such information as input.
If you need higher resolution, put higher resolution images in your folder. If it's simple shapes, you may use built-in classes such as visual.ShapeStim and its variants: visual.Polygon, visual.Rect and visual.Circle. PsychoPy can scale these shapes freely so they always stay sharp.
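If the distortion is a change in proportions rather than blur, note that the original image is 170 x 120 pix (ratio about 1.417), while size=[9, 6.3] has a ratio of about 1.429, so the image is stretched slightly. A small helper can derive the height from the width so the proportions stay exact. This is only a sketch: `scaled_size` is a made-up name, not part of PsychoPy, and its result would be passed as the `size` argument.

```python
def scaled_size(width_px, height_px, target_width):
    """Return [width, height] in the target units, keeping the pixel aspect ratio."""
    return [target_width, target_width * height_px / width_px]

# 170 x 120 pix image shown 9 cm wide: the height follows from the ratio.
size_cm = scaled_size(170, 120, 9.0)
print(size_cm)
```

This keeps the proportions intact, but as explained above it cannot add detail that the 170 x 120 source does not contain.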
I'm searching for methods of text recognition based on document borders, or methods that can solve the problem of finding a new viewpoint.
For example, the camera is at point (x1,y1,z1) and the resulting picture has perspective distortions, but we can find a camera position (x2,y2,z2) that corrects the picture.
Thanks.
The usual approach, which assumes that the document's page is approximately flat in 3D space, is to warp the quadrangle encompassing the page into a rectangle. To do so you must estimate a homography, i.e. a (linear) projective transformation between the original image and its warped counterpart.
The estimation requires matching points (or lines) between the two images, and a common choice for documents is to map the page corners in the original images to the image corners of the warped image. This will in general produce a rectangle with an incorrect aspect ratio (i.e. the warped page will look "wider" or "taller" than the real one), but this can be easily corrected if you happen to know in advance what the real aspect ratio is (for example, because you know the type of paper used, whether letter, A4, etc.).
A simple algorithm to perform the estimation is the so-called Direct Linear Transformation.
The OpenCV library contains routines that help with all of these tasks; look into it.
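To make the estimation step concrete, here is a minimal pure-Python sketch of the four-corner case. It uses the inhomogeneous form of the Direct Linear Transformation, fixing the bottom-right entry of the homography to 1 and solving the resulting 8x8 linear system, rather than the SVD-based homogeneous form usually found in libraries; it would fail in the rare case where that entry should be 0. All function names and the corner coordinates in the example are invented for illustration.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def homography_from_corners(src, dst):
    """3x3 homography (bottom-right entry fixed to 1) mapping 4 src points to 4 dst points."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11 x + h12 y + h13) / (h31 x + h32 y + 1), and likewise for v.
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(A, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]

def warp_point(H, x, y):
    """Apply the homography to one point, including the projective division."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)

# Hypothetical page corners in the photo, mapped to an A4-shaped rectangle (210 x 297).
page = [(100, 50), (400, 80), (420, 500), (80, 460)]
rect = [(0, 0), (210, 0), (210, 297), (0, 297)]
H = homography_from_corners(page, rect)
print(warp_point(H, 100, 50))
```

Choosing the target rectangle with the known paper proportions, as in the example, is what takes care of the aspect-ratio correction mentioned above.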
I am trying to extract the images stored in PDF as stream. While I can do this easily, I am not able to get the accurate image rotation information. I am looking for specific information such as MediaBox, Rotate and landscape/portrait mode.
When I extract the image, its alignment does not match what the end user sees in a PDF reader.
I binary compared two PDFs (where an image was rotated 90 in the former and the same image was rotated 270 in the latter) and I found difference in a particular stream object. However, I am not able to make out what that stream information is.
Here are the two documents I am talking about:
http://bit.ly/eQZGKJ
http://bit.ly/g43Whb
The position, size and orientation of the image when displayed on the page is determined by the current transformation matrix (CTM). You have to execute the entire page content stream to determine the CTM that is in place when the image is displayed. It's like a virtual rendering of the PDF page.
Almost every image has a so-called CTM (current transformation matrix) in effect when it is drawn. It tells a reader the position, rotation and skew of the image.
Check the cm operator, which is described in the PDF reference as "Modify the current transformation matrix (CTM) by concatenating the specified matrix (see Section 4.2.1, “Coordinate Spaces”). Although the operands specify a matrix, they are written as six separate numbers, not as an array." In your PDF documents:
rotated1.pdf contains "0 550.08 -743.04 0 743.04 0 cm"
rotated2.pdf contains "0 -550.08 743.04 0 0 550.08 cm"
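So we can say that in rotated1.pdf the image is rotated 90 degrees counterclockwise and in rotated2.pdf 90 degrees clockwise (measured in PDF's default coordinate space, where y points up). In both cases it is also scaled and translated.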
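The rotation can be read off such a matrix directly. The sketch below assumes a single cm operand list a b c d e f with no skew and no other cm operators concatenated before it (a real page may nest several, and a page-level Rotate entry changes what the viewer finally shows); `describe_cm` is a made-up helper name.

```python
import math

def describe_cm(a, b, c, d, e, f):
    """Return (angle_degrees, width_pts, height_pts, translation) for a cm matrix."""
    angle = math.degrees(math.atan2(b, a))   # counterclockwise, y axis pointing up
    width = math.hypot(a, b)                 # length of the image's x edge
    height = math.hypot(c, d)                # length of the image's y edge
    return angle, width, height, (e, f)

print(describe_cm(0, 550.08, -743.04, 0, 743.04, 0))   # rotated1.pdf
print(describe_cm(0, -550.08, 743.04, 0, 0, 550.08))   # rotated2.pdf
```

A positive angle is counterclockwise and a negative one clockwise, so the two files come out as opposite quarter turns of an image drawn 550.08 x 743.04 points in size.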
It can also have a clip so you may only see part of the image. MediaBox and rotation relate to the whole page.