PDF Text positioning - pdf

Consider the following operator sequence:
Tf: R8 9.96
Tm: 0 1.00057 -1 0 105.12 60.3506
TJ: line 1:
Tf: R8 9.96
Tm: 0 1.00057 -1 0 105.12 95.9906
TJ: value 1
Tm: 0 1.00057 -1 0 116.16 60.3505
TJ: line 2:
Tf: R8 9.96
Tm: 0 1.00057 -1 0 116.16 124.551
TJ: value 2
Tm: 0 1.00057 -1 0 127.2 60.3507
TJ: line 3:
Tf: R8 9.96
Tm: 0 1.00057 -1 0 127.2 106.671
TJ: value 3
Tm: 0 1.00057 -1 0 138.24 60.3508
TJ: line 4:
Tf: R8 9.96
Tm: 0 1.00057 -1 0 138.24 112.791
TJ: value 4
PDF displays it as:
line 1: value 1
line 2: value 2
line 3: value 3
line 4: value 4
According to the PDF documentation, the matrix consists of [a b c d e f], where e = Tx and f = Ty.
From the first two command blocks (which produce the first line of text) I noticed that Tx and Ty have actually switched places: 105.12, which I would expect to be the vertical position, stays the same.
The PDF reference also says this about rotation:
Rotations are produced by [ cos θ sin θ −sin θ cos θ 0 0 ], which has
the effect of rotating the coordinate system axes by an angle θ
counterclockwise.
This seems to be why Tx changes the vertical position and Ty the horizontal one: sin(90°) = 1 and cos(90°) = 0, meaning a 90° counterclockwise rotation.
Questions:
Why do the lines appear in the correct order in the actual PDF when e (Tx), which with this rotation controls the vertical position, increases from line to line? Going by the translation, e (Tx) should decrease from one line to the next.
Why are the letters and words not rotated? Only e (Tx) and f (Ty) are switched, and that is all.

You only consider text matrix settings. You don't tell us about the current transformation matrix at the time of those text objects, and neither do you tell us about the page rotation value.
Considering your observations I would assume the page globally is rotated 90° clockwise.
This would explain why your 90° counterclockwise rotated text appears upright (your second question).
Furthermore, with that page rotation the x axis would be vertical with coordinate values increasing downwards, which answers your first question.
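As a rough check of that assumption, the following Python sketch pushes the text matrix from the question through a 90° clockwise page rotation. It assumes an identity CTM and a Rotate value of 90, neither of which the question confirms, and the helper names are made up for this illustration.

```python
# Sketch: where do the text-space axes end up on screen?
# Assumptions (not confirmed by the question): identity CTM, page Rotate = 90.

def apply_linear(matrix, vx, vy):
    """Apply only the linear part of a PDF matrix [a b c d e f] to a direction."""
    a, b, c, d, _e, _f = matrix
    return (vx * a + vy * c, vx * b + vy * d)

def rotate_page_90_cw(vx, vy):
    """Rotate a direction by 90 degrees clockwise (y axis pointing up)."""
    return (vy, -vx)

tm = (0, 1.00057, -1, 0, 105.12, 60.3506)  # Tm from the question

# Directions of the baseline and of "glyph up", taken from text space to the screen.
baseline = rotate_page_90_cw(*apply_linear(tm, 1, 0))
glyph_up = rotate_page_90_cw(*apply_linear(tm, 0, 1))

# Direction in which an increasing e (user space x) moves on screen.
increasing_e = rotate_page_90_cw(1, 0)

print("baseline runs towards", baseline)            # positive x: left to right
print("glyph up points towards", glyph_up)          # positive y: upright
print("increasing e moves towards", increasing_e)   # negative y: downwards
```

Under these assumptions the baseline runs left to right, the glyphs stand upright, and a larger e moves the text down the page, which is consistent with what the question observes.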
Some references
Rotate - integer - (Optional; inheritable) The number of degrees by which the page
shall be rotated clockwise when displayed or printed. The value
shall be a multiple of 90. Default value: 0.
(Table 30 – Entries in a page object - ISO 32000-1)
CTM - array - The current transformation matrix, which maps positions from
user coordinates to device coordinates (see 8.3, "Coordinate
Systems"). This matrix is modified by each application of the
coordinate transformation operator, cm. Initial value: a matrix
that transforms default user coordinates to device
coordinates.
(Table 52 – Device-Independent Graphics State Parameters - ISO 32000-1)

Related

How to calculate viewing BBox of an image in pdf?

I am trying to calculate the coordinates of the visible part of an image. The actual image is larger than what is shown below (fig1); only part of it is visible. I want to understand how the matrices transform the image, i.e. how to calculate the coordinates of the image as shown.
fig1
The content stream looks like below
The coordinates I get when I multiply the first q cm with the second q cm are
[-122.196, 356.535, 484.061, 759.372]
But these are the coordinates of the full image. How does the 're' change the calculation for the visible part of the image?
File
original pdf
After removing the 're' and 'W*'
I need another answer for the same scenario.
second file
What I tried
0.24 0 0
0 -0.24 0
0 850 1
Applying the above CTM to the 're' gives
[595.92 0 0 7.05]
The next cm instruction becomes the CTM and looks like
1 0 0
0 1 0
262 404 1
What will the resulting matrix be, and how can I calculate it?
For the sake of brevity I'm going to round the numbers a bit here.
Case 1
Let's simply analyze your content stream excerpt:
Let's assume that the preceding instructions left the user space coordinate system and the clip path in its default state, so we can assume an identity current transformation matrix (CTM) and a clip path encompassing the whole page.
The first instruction
.24 0 0 -.24 0 850 cm
then changes the CTM to
0.24 0 0
0 -0.24 0
0 850 1
Thus, the rectangle path defined and used as a clip path thereafter
169 349.49 1038.37 1670.15 re
has, in the default user space, the coordinates (lower left, upper right):
[40.56 365.29 289.77 766.12]
Then the next cm instruction
2517.74 0 0 -1670.15 -504.99 2019.64 cm
changes the CTM to
604.26 0 0
0 400.84 0
-121.2 365.29 1
So the following bitmap image
/Im0 Do
is drawn, in the default user space, at the coordinates (lower left, upper right):
[-121.2 365.29 483.06 766.13]
This area partially is outside the clip path, so we get the visible image area in user space coordinates by intersecting those coordinates with the clip path, resulting in
[40.56 365.29 289.77 766.12]
So these are the coordinates you appear to be looking for.
Beware, in general clip paths can have arbitrary forms, and the CTM at image drawing time may not only scale, mirror, or translate (resulting in a rectangle parallel to the axis) but also rotate or skew (resulting in a rhomboid or something not parallel to the axis). Thus, calculating the intersection and making sense out of the result in general is more complicated.
In a comment you ask
But still I need bit more clarity to, after 're' how you got [40.56 365.29 289.77 766.12] this. How the calculation is happening.
I got those by applying the CTM to two diagonally opposed corners of the rectangle.
To get two such corners of
169 349.49 1038.37 1670.15 re
I first took the anchor point at 169 349.49 and, as the second point, the anchor point with width and height added, i.e. 1207.37 2019.64.
Then I applied the CTM to those two points
                          | 0.24   0      0 |
[169  349.49  1]       ×  | 0     -0.24   0 |  =  [40.56  766.12  1]
                          | 0      850    1 |
                          | 0.24   0      0 |
[1207.37  2019.64  1]  ×  | 0     -0.24   0 |  =  [289.77  365.29  1]
                          | 0      850    1 |
So I get the transformed corners at 40.56 766.12 and 289.77 365.29.
Due to the mirroring the resulting points were not lower-left to upper-right but instead upper-left to lower-right. Thus, I normalized the rectangle to [40.56 365.29 289.77 766.12].
Beware, this calculation makes use of the fact that the CTM only scaled, mirrored, and translated. If it also rotated or skewed, I would have had to apply the CTM to all corners of the rectangle (or at least three of them) and then worked with the rhomboid spanned by them.
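The corner calculation and the intersection from Case 1 can be sketched in Python as follows. This is only an illustration under the same restriction as above (the CTM merely scales, mirrors, and translates); the function names are made up, and the image area is simply taken over from the numbers given earlier.

```python
# Sketch: map an 're' rectangle through a scale/mirror/translate CTM and
# intersect it with the image area. Not valid for rotating or skewing CTMs.

def transform(ctm, x, y):
    """Apply a PDF matrix [a b c d e f] to the point (x, y)."""
    a, b, c, d, e, f = ctm
    return (a * x + c * y + e, b * x + d * y + f)

def rect_to_user_space(ctm, x, y, w, h):
    """Return (llx, lly, urx, ury) of an 'x y w h re' rectangle after the CTM."""
    x0, y0 = transform(ctm, x, y)
    x1, y1 = transform(ctm, x + w, y + h)
    # Normalize, because mirroring may swap lower/upper or left/right.
    return (min(x0, x1), min(y0, y1), max(x0, x1), max(y0, y1))

def intersect(r, s):
    """Intersect two axis-parallel rectangles; None if they do not overlap."""
    llx, lly = max(r[0], s[0]), max(r[1], s[1])
    urx, ury = min(r[2], s[2]), min(r[3], s[3])
    return (llx, lly, urx, ury) if llx < urx and lly < ury else None

ctm = (0.24, 0, 0, -0.24, 0, 850)
clip = rect_to_user_space(ctm, 169, 349.49, 1038.37, 1670.15)
image = (-121.2, 365.29, 483.06, 766.13)  # image area as computed in Case 1
visible = intersect(clip, image)
print([round(v, 2) for v in visible])
```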
Case 2
In an edit to your question you added another case:
This example shows that one has to inspect the XObject in question first.
If one assumed that Fm0 was an image XObject, the image would be drawn in a .24×.24 default user space units square, a tiny dot.
But Fm0 is not an image XObject, instead it is a form XObject which in turn shows an image XObject from its own resources. Thus, here is another step in the calculations:
The first instruction
.24 0 0 -.24 0 850 cm
then changes the CTM to
0.24 0 0
0 -0.24 0
0 850 1
Thus, the rectangle path defined and used as a clip path thereafter
0 0 2483.33 3512.32 re
has, in the default user space, the coordinates (lower left, upper right):
[0 7.04 596 850]
Then the next cm instruction
1 0 0 1 262 404 cm
changes the CTM to
0.24 0 0
0 -0.24 0
62.88 753.04 1
Due to
/Fm0 Do
we then have to continue with the XObject Fm0. First of all it has a bounding box entry
[ 0 0 1959 1306 ]
Applying the CTM to this we get a bounding box in the default user space of
[62.88 439.6 533.04 753.04]
which has to be intersected with the clip path.
The relevant content of the Fm0 is
0.72 196.505 1957.892 913.266 re
W* n
q
/GS0 gs
1957.8926 0 0 -1304.21912 0.7203979 1305.13342 cm
/Im0 Do
Q
Thus, the rectangle path defined and intersected with the clip path thereafter
0.72 196.51 1957.89 913.27 re
has, in the default user space, the coordinates (lower left, upper right):
[63.05 486.69 532.95 705.88]
Then the next cm instruction
1957.89 0 0 -1304.22 0.72 1305.13 cm
changes the CTM to
469.89 0 0
0 313.01 0
63.05 439.81 1
So the following bitmap image
/Im0 Do
is drawn, in the default user space, at the coordinates (lower left, upper right):
[63.05 439.81 532.94 752.82]
The effective clip path at this time is the intersection of the rectangles
[0 7.04 596 850]
[62.88 439.6 533.04 753.04]
[63.05 486.69 532.95 705.88]
so it is the rectangle
[63.05 486.69 532.95 705.88]
Thus, the visible area of that drawn image is
[63.05 486.69 532.94 705.88]
(Well, I hope it is, but maybe I somewhere along the path erred in some calculation...)
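To double-check such a chain of cm instructions, the concatenation can be replayed with a few lines of Python. The sketch below is an illustration only: it assumes an identity CTM at the start and an identity Matrix entry on the form XObject, and the function name concat is made up.

```python
# Sketch: replay the cm instructions of Case 2 (rounded values from above).
# Assumptions: identity CTM at the start, identity form XObject Matrix.

def concat(m, ctm):
    """Return m x ctm for PDF matrices [a b c d e f], as the cm operator does."""
    a1, b1, c1, d1, e1, f1 = m
    a2, b2, c2, d2, e2, f2 = ctm
    return (
        a1 * a2 + b1 * c2,
        a1 * b2 + b1 * d2,
        c1 * a2 + d1 * c2,
        c1 * b2 + d1 * d2,
        e1 * a2 + f1 * c2 + e2,
        e1 * b2 + f1 * d2 + f2,
    )

cm_instructions = [
    (0.24, 0, 0, -0.24, 0, 850),                  # page content stream
    (1, 0, 0, 1, 262, 404),                       # before /Fm0 Do
    (1957.89, 0, 0, -1304.22, 0.72, 1305.13),     # inside Fm0, before /Im0 Do
]

ctm = (1, 0, 0, 1, 0, 0)
for m in cm_instructions:
    ctm = concat(m, ctm)
    print([round(v, 2) for v in ctm])
```

Each printed line is the CTM in effect after the corresponding cm; the last one maps the unit square of the image into default user space.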

Drawing boundary of shape entirely inside the shape in PDF

I am using Path Construction in PDF to draw a shape, say a rectangle. For example:
0 0 m 0 1 l 1 1 l 1 0 l 0 0 l B
But now the stroked line connecting (0,0) and (0,1) is centered on those points. Therefore, the boundary "leaves" the rectangle by half of the line width.
Is there a parameter, so that the boundary is drawn entirely inside the rectangle?
This is just the normal behaviour of the line drawing operation.
The thickness of the line is spread equally to both sides of the line. So if you have a 10pt thick line from (0,0) to (10,0) and use the butt cap line style, you will have a filled rectangular area with the corners (0,-5), (10,-5), (10,5), (0,5).
Have a look at this PDF file - you can see this effect in the second row, second column. The inner white lines and the outer black lines have the same start and end points.
So if you want to have everything inside that rectangle, either use a clip path, as mkl said, or calculate the necessary end points, taking the line width and line cap/join style into account.
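For the second option, a minimal Python sketch might look like this. It assumes the default miter join and a closed rectangle drawn with the re operator, so that insetting the path by half the line width on every side is enough; the function name is made up for this example.

```python
# Sketch: build content stream operators for a rectangle whose stroke
# stays inside the box. Assumes miter joins and a closed 're' path.

def inset_rect_ops(x, y, width, height, line_width):
    """Return operators for a filled and stroked rectangle inset by half the line width."""
    if line_width > min(width, height):
        raise ValueError("line width does not fit inside the rectangle")
    half = line_width / 2
    return (
        f"{line_width:g} w "
        f"{x + half:g} {y + half:g} "
        f"{width - line_width:g} {height - line_width:g} re B"
    )

print(inset_rect_ops(0, 0, 1, 1, 0.1))  # prints: 0.1 w 0.05 0.05 0.9 0.9 re B
```

With other join styles or with dashed or open paths, the corners and ends need separate treatment, so this only covers the simple case from the question.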
As already mentioned in a comment, using a clip path the size of that rectangle is an option.
As your path only consists of the rectangle in question, you can do so very easily: simply add the clipping path operator W before the path painting operator B:
0 0 m 0 1 l 1 1 l 1 0 l 0 0 l W B
If you don't want to keep the clip path, enclose all this in save-state/restore-state:
q
0 0 m 0 1 l 1 1 l 1 0 l 0 0 l W B
Q

TJ and Td offset difference

I have some text that I want to edit (justified text, really annoying), so I was wondering if this:
BT /FAAABA 10 Tf
1 0 0 -1 0 9.38000011 Tm
(Some) Tj
36.77199936 0 Td
(text) Tj
38.4280014 0 Td
(stuff) Tj
33.42799759 0 Td
...
is equivalent to this:
BT
/FAAABA 10 Tf
1 0 0 -1 0 9.38000011 Tm
[(Some)-36.77199936*1000(text)-38.4280014*1000(stuff)-33.42799759*1000] TJ
...
Assuming horizontal text, we determined in my answer to your previous question that the horizontal displacement tx corresponding to a number Tj in a TJ array can be calculated as
tx = (−Tj / 1000) × Tfs × Th
where Tfs is the current font size and Th is the current horizontal scaling factor.
Thus, if you have a horizontal displacement tx and want to calculate the corresponding number Tj for a TJ array, you simply resolve the equation above to:
Tj = -1000 × tx / (Tfs × Th)
BUT this is not exactly the situation in your case because Td does not simply shift the text matrix by its parameters but instead shifts the text line matrix by them and sets the text matrix to the new text line matrix value:
tx ty Td
Move to the start of the next line, offset from the start of the current line by (tx, ty). tx and ty shall denote numbers expressed in unscaled text space units. More precisely, this operator shall perform these assignments:
Tm = Tlm = [1 0 0; 0 1 0; tx ty 1] × Tlm
(ISO 32000-1, Table 108 – Text-positioning operators)
Thus, the tx parameter of Td is not the tx to put into the equation above but you instead have to subtract the width of the text drawn since the last setting of the text line matrix.
So to transform your example
BT /FAAABA 10 Tf
1 0 0 -1 0 9.38000011 Tm
(Some) Tj
36.77199936 0 Td
(text) Tj
38.4280014 0 Td
(stuff) Tj
33.42799759 0 Td
into a
BT
/FAAABA 10 Tf
1 0 0 -1 0 9.38000011 Tm
[(Some) NUM1 (text) NUM2 (stuff) NUM3] TJ
form, you calculate the numeric values NUM1, NUM2, and NUM3 like this:
NUM1 = -1000 × (36.77199936 - width("Some")) / (Tfs × Th)
NUM2 = -1000 × (38.4280014 - width("text")) / (Tfs × Th)
NUM3 = -1000 × (33.42799759 - width("stuff")) / (Tfs × Th)
When calculating the widths of those strings remember to take the font size, the character spacing, and the horizontal scaling into account!
And even then the two forms are not identical because the text line matrix at the end differs.
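The NUM calculation above can be sketched in Python like this. The string widths used here are purely hypothetical placeholders; real values have to come from the glyph widths of the font /FAAABA, which the question does not show. The function name is made up as well.

```python
# Sketch: turn Td offsets into TJ numbers, given the widths of the strings
# drawn between the Td instructions (all in unscaled text space units).

def td_offsets_to_tj_numbers(segments, font_size, h_scale=1.0):
    """segments: (td_tx, text_width) pairs; returns one TJ number per pair."""
    return [
        -1000 * (td_tx - text_width) / (font_size * h_scale)
        for td_tx, text_width in segments
    ]

# HYPOTHETICAL widths, for illustration only (not taken from the actual font).
widths = {"Some": 26.1, "text": 17.8, "stuff": 22.2}
offsets = [("Some", 36.77199936), ("text", 38.4280014), ("stuff", 33.42799759)]

numbers = td_offsets_to_tj_numbers(
    [(tx, widths[s]) for s, tx in offsets], font_size=10
)
array = " ".join(f"({s}) {n:.3f}" for (s, _), n in zip(offsets, numbers))
print(f"[{array}] TJ")
```

A positive gap between the end of one string and the start of the next yields a negative TJ number, matching the sign convention of the formula above.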

PDF Specification - Get Font Size in Points

I'm trying to write a PDF parser in C# but I've run into an issue where I'm unsure how to interpret the specification.
Unless otherwise specified user space in a PDF document is 1/72 of an inch (i.e. 1pt).
The scale provided by the Tf operator scales the font from the standard size (generally 1 unit of user space / 1pt) to the correct display size.
I have the following page content:
1 0 0 -1 0 792 cm
q
0 0 612 792 re
W* n
q
.75 0 0 .75 0 0 cm
1 1 1 RG 1 1 1 rg
/G0 gs
0 0 816 1056 re
f
0 0 816 1056 re
f
0 0 816 1056 re
f
Q
Q
q
0 0 612 791.25 re
W* n
q
.75 0 0 .75 0 0 cm
1 1 1 RG 1 1 1 rg
/G0 gs
0 0 816 1055 re
f
0 96 816 960 re
f
0 0 0 RG 0 0 0 rg
BT
/F0 21.33 Tf
1 0 0 -1 0 140 Tm
96 0 Td <0037> Tj
13.0280762 0 Td <004B> Tj
11.8616943 0 Td <004C> Tj
4.7384338 0 Td <0056> Tj
ET
BT
/F1 21.33 Tf
1 0 0 -1 0 140 Tm
136.292267 0 Td <0001> Tj
ET
...
I know that the font size of the two text operations in the sample is 16pt; however, the Tf operator is using a size of 21.33. In order to convert from this font size back to points I was intending to use the scale (y) of the cm operator making the point size:
21.33 * 0.75 = 15.9975
However I could find nothing in the PDF specification supporting this conversion and none of the libraries I checked (PDFBox, iTextSharp, Spire PDF) listed the font size as anything but 21.33.
Should I use the CTM (as defined by the cm operator) to scale the font size back to the correct scale or is this just pure chance?
The pdf file is here: https://github.com/UglyToad/PdfPig/blob/master/src/UglyToad.PdfPig.Tests/Integration/Documents/Single%20Page%20Simple%20-%20from%20google%20drive.pdf
First of all, your comparison with other text extractors is based on a misunderstanding:
none of the libraries I checked (PDFBox, iTextSharp, Spire PDF) listed the font size as anything but 21.33.
The "font size" parameter returned by all those libraries simply is the size argument of the Tf instruction, not the effective font size you observe in the final document, which is what you are trying to determine. So your comparison with other libraries does not make sense.
Now, concerning your approach:
In order to convert from this font size back to points I was intending to use the scale (y) of the cm operator making the point size:
21.33 * 0.75 = 15.9975
While some libraries call it that, calling the fourth cm parameter "scale (y)" is misleading. E.g. for text rotated by 90° it usually is zero, while the graphic representation usually is not reduced to zero height.
Thus, merely using the "scale (y)" parameter does not work, you have to take the whole transformation into account.
Finally, let's discuss what you actually are after.
As long as the combined transformation matrix (current transformation matrix + text matrix + horizontal scaling) is orthogonal and text lines are following this orthogonality, the meaning of your notion of font size is fairly obvious.
But as soon as there is a shearing in that combined matrix, the meaning of "font size" is not obvious anymore.
You might mean the length of what an originally vertical line (one unit high) is transformed into.
You might mean the length of the projection of that transformed line onto a line at a right angle to the transformed font base line.
Or you might mean the length of the projection of that transformed line onto a line at a right angle to an observed base line.
The first two numbers are trivial to calculate using simple linear algebra. The third number may be more difficult because you have to determine the base line observed by humans in the resulting PDF. In case of innovative use of transformations this might be non-trivial.
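As an illustration of the first two interpretations, here is a small Python sketch. It builds the combined matrix as [Tfs×Th 0 0 Tfs 0 0] × Tm × CTM (text rise left out) and applies it to the sample from the question; the function names are made up, and the Td translations are ignored because they do not affect sizes.

```python
import math

# Sketch: two possible notions of "effective font size" for a combined matrix.

def concat(m, n):
    """Return m x n for PDF matrices [a b c d e f]."""
    a1, b1, c1, d1, e1, f1 = m
    a2, b2, c2, d2, e2, f2 = n
    return (
        a1 * a2 + b1 * c2, a1 * b2 + b1 * d2,
        c1 * a2 + d1 * c2, c1 * b2 + d1 * d2,
        e1 * a2 + f1 * c2 + e2, e1 * b2 + f1 * d2 + f2,
    )

def effective_font_sizes(font_size, h_scale, tm, ctm):
    """Return (length of the transformed vertical unit line,
    its projection at a right angle to the transformed base line)."""
    a, b, c, d, _e, _f = concat(
        (font_size * h_scale, 0, 0, font_size, 0, 0), concat(tm, ctm)
    )
    up_length = math.hypot(c, d)
    base_length = math.hypot(a, b)
    # Parallelogram area divided by the base length gives the height over the base.
    perpendicular = abs(a * d - b * c) / base_length
    return up_length, perpendicular

# CTM from the sample: "1 0 0 -1 0 792 cm" followed by ".75 0 0 .75 0 0 cm".
ctm = concat((0.75, 0, 0, 0.75, 0, 0), (1, 0, 0, -1, 0, 792))
tm = (1, 0, 0, -1, 0, 140)
print(effective_font_sizes(21.33, 1.0, tm, ctm))
```

Without shearing both numbers coincide, which for the sample is the 21.33 × 0.75 from the question; with shearing they differ, and it is up to you which one you call the font size.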

CATransform3D row/column order

I have a bit of confusion regarding the matrix row/column order of the CATransform3D struct. The struct defines a matrix like this:
[m11 m12 m13 m14]
[m21 m22 m23 m24]
[m31 m32 m33 m34]
[m41 m42 m43 m44]
At first, it would seem that the values define rows (so that [m11 m12 m13 m14] forms the first row), but when you create a translation matrix by (tx, ty, tz), the matrix will look like this:
[ 1 0 0 0]
[ 0 1 0 0]
[ 0 0 1 0]
[tx ty tz 1]
My confusion comes from the fact that this is not a valid translation matrix; multiplying it with a 4-element column vector will not translate the point.
My guess is that the CATransform3D struct stores the values in column-order, so that the values [m11 m12 m13 m14] form the first column (and not the first row).
Can anyone confirm?
Yes, CATransform3D is in column major order because that is how OpenGL(ES) wants it. Core Animation uses GL in the background for its rendering. If you want proof, check out the man page for glMultMatrix:
PARAMETERS
m Points to 16 consecutive values that are used as the elements of a 4 x 4 column-major matrix.
This really should be clearer in the docs for CALayer.
Your initial interpretation was correct; CATransform3D does define the matrix below:
[m11 m12 m13 m14]
[m21 m22 m23 m24]
[m31 m32 m33 m34]
[m41 m42 m43 m44]
And yes, although it may be confusing (if you are used to pre-multiplying transform matrices), this yields the translation matrix:
[ 1 0 0 0]
[ 0 1 0 0]
[ 0 0 1 0]
[tx ty tz 1]
See Figure 1-8 in the "Core Animation Programming Guide".
This is a valid transformation matrix if you post-multiply your transform matrices, which is what Apple does in Core Animation (see Figure 1-7 in the same guide, although beware the equation is missing transpose operations).
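The difference between the two conventions can be seen in a few lines of plain Python. This is only an illustration of the matrix arithmetic, not of any Core Animation API; the helper names are made up.

```python
# Sketch: the same 4x4 matrix applied to a row vector and to a column vector.

def row_vector_times_matrix(v, m):
    """Compute v x m, treating v as a 1x4 row vector."""
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]

def matrix_times_column_vector(m, v):
    """Compute m x v, treating v as a 4x1 column vector."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

tx, ty, tz = 10, 20, 30
translation = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [tx, ty, tz, 1],
]
point = [1, 2, 3, 1]

print(row_vector_times_matrix(point, translation))     # [11, 22, 33, 1]: translated
print(matrix_times_column_vector(translation, point))  # [1, 2, 3, 141]: not a translation
```

So the matrix with tx, ty, tz in the last row translates points when they are written as row vectors on the left; for column vectors one would use the transposed matrix.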