PDF Low-Level: Drawing a line in the content object? - pdf

I have searched extensively online and looked through the PDF specification, yet I still can't figure out how to draw a simple black line on a PDF page using the instructions in the content object's stream.
Let's say I just want to draw a black line 1 pixel thick (assuming 72 dpi) at x = 400, from y = 100 to y = 300.
This should in theory be a very simple operation, but the PDF spec goes on and on about all kinds of fancy things and appears to forget to explain how I would go about performing this simple operation.
Please can someone point me in the right direction?

In the PDF specification, have a look at chapter 8 (Graphics) and in there section 8.5, Path Construction and Painting.
To draw a simple straight path, you need a "move to" operation followed by a "line to" operation:
400 100 m
400 300 l
You can then stroke the path using the S operator so your code becomes
400 100 m
400 300 l
S
By default the stroking color is black, so this already gives you a black line :-) If you want to be sure, set the relevant graphics state parameters explicitly:
0 G
1 w
400 100 m
400 300 l
S
The first line now sets the stroking color space to DeviceGray and the gray level to 0 (black). The following line sets the line width of your stroked line to 1 user unit (how large this comes out depends on your current transformation matrix).
You can apply a neat trick if you really want 1 pixel (please don't for production files though!) and that is to set the width to zero:
0 w
This gives you "the thinnest line that can be rendered at device resolution: 1 device pixel wide".
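To try these operators out, the content stream has to be wrapped in a complete PDF file. The following is a minimal sketch in Python, not a general-purpose writer: the function name build_pdf, the object layout and the US Letter MediaBox are assumptions made for illustration, and nothing here comes from the answer itself beyond the five content stream lines.

```python
# Content stream from the answer: black stroke, 1 user unit wide,
# from (400, 100) to (400, 300).
CONTENT = b"0 G\n1 w\n400 100 m\n400 300 l\nS"


def build_pdf(content: bytes) -> bytes:
    """Assemble a minimal one-page PDF whose page content is `content`."""
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        # MediaBox of 612 x 792 points (US Letter) is an arbitrary choice.
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
        b"/Resources << >> /Contents 4 0 R >>",
        b"<< /Length %d >>\nstream\n" % len(content) + content + b"\nendstream",
    ]
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of "N 0 obj" for the xref table
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"
    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # each xref entry is exactly 20 bytes
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)
```

Writing the result with open("line.pdf", "wb").write(build_pdf(CONTENT)) should give a file that a viewer can open. Swapping 1 w for 0 w in CONTENT is an easy way to compare the two line widths discussed above.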

Related

How to get a uniform line width in PDF regardless of the device space aspect ratio?

The width of a line in PDF is defined in terms of distances in the user space. In my use case, the aspect ratio of the device space (e.g. 4:3) is different from the aspect ratio of the user space (e.g. 1:1), which causes the line widths in the device space to be different in vertical and horizontal directions.
For example, in this picture the horizontal and vertical lines should be of the same width, but they're not:
I would like to perform scaling that only results in line width uniformity and does not affect anything else.
I asked a similar question regarding PostScript here: How to ensure line widths are the same vertically and horizontally in PostScript?. A solution based in part on the answer to this question works for PostScript, but does not work in PDF after what seems to be an almost one-to-one translation.
I tried changing the stroke command S to q 1 0 0 1.5 0 0 cm S Q h, where q saves the graphics state, 1 0 0 1.5 0 0 cm scales the current transformation matrix, Q restores the graphics state, and h closes the current subpath. However, in addition to correctly scaling the line widths, this also scales the y-coordinates of the line endpoints by 1.5.
This is what I need to get:
But with q 1 0 0 1.5 0 0 cm S Q h, I get this instead:
How to make the line width uniform in the device space in PDF without affecting anything else?

Clipping path seems to be outside of text

Recently I wanted to construct a PDF document with text clipping. With 4 Tr I tried to define the text as the clipping area, but when I tried to fill the lower part of the text with red, the result was reversed.
Does anyone know why?
Thanks for any answer!
stream
BT
4 8 Td
0.8 0.2 0.7 rg % Write in purple.
4 Tr % Fill & Use text as clipping area.
/TR 32 Tf
(Hallo Welt) Tj
1 0 0 rg % Fill in red.
0 0 200 20 re F % <- Mistake?
ET
What I wanted to have:
What I got:
Have a look at the specification ISO 32000-1:
The behaviour of the clipping modes requires further explanation. Glyph outlines shall begin accumulating if a BT operator is executed while the text rendering mode is set to a clipping mode or if it is set to a clipping mode within a text object. Glyphs shall accumulate until the text object is ended by an ET operator; the text rendering mode shall not be changed back to a nonclipping mode before that point.
(section 9.3.6 Text Rendering Mode )
In your sample you don't wait until the ET for the clipping path to take effect. So, when you are painting the red rectangle, your special clipping path is not yet in effect.
Furthermore your operation sequence actually is invalid! Neither path construction nor path painting operators (i.e. neither your 0 0 200 20 re nor your F) are allowed inside a text object, cf. Figure 9 – Graphics Objects in the specification:
Thus, strictly speaking your PDF viewer had better refuse to draw your content stream at all.
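Putting both points together, the rectangle has to be painted after ET. The following Python sketch shows one possible corrected operator order and a naive check for path operators inside a text object. The names clipped_text_stream and path_ops_inside_text are made up for this illustration, the tokenizer ignores nested parentheses in strings, and the q/Q pair is an addition so that the clipping path does not leak into later drawing.

```python
import re

# Path construction, painting and clipping operators; none are allowed in BT ... ET.
PATH_OPERATORS = {"m", "l", "c", "v", "y", "h", "re", "S", "s", "f", "F",
                  "f*", "B", "B*", "b", "b*", "n", "W", "W*"}

# The content stream from the question (comments kept as posted).
ORIGINAL = """BT
4 8 Td
0.8 0.2 0.7 rg % Writing lila.
4 Tr % Fill & Use text as clipping area.
/TR 32 Tf
(Hallo Welt) Tj
1 0 0 rg % Fill in red.
0 0 200 20 re F % <- Mistake?
ET"""


def escape_pdf_string(text: str) -> str:
    """Escape backslashes and parentheses for a PDF literal string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def clipped_text_stream(text: str, font: str = "TR", size: int = 32) -> str:
    """Fill the text, clip to its outlines, then paint the rectangle after ET."""
    return "\n".join([
        "q",                      # save state so the clip can be undone
        "BT",
        "4 8 Td",
        "0.8 0.2 0.7 rg",         # purple fill for the text
        "4 Tr",                   # mode 4: fill text and add it to the clipping path
        f"/{font} {size} Tf",
        f"({escape_pdf_string(text)}) Tj",
        "ET",                     # the accumulated clip takes effect here
        "1 0 0 rg",               # red fill
        "0 0 200 20 re",
        "f",
        "Q",                      # restore the previous clipping path
    ])


def path_ops_inside_text(stream: str) -> list[str]:
    """Return the path operators that appear between BT and ET (naive tokenizer)."""
    stream = re.sub(r"\((?:\\.|[^\\()])*\)", "()", stream)  # blank out strings
    stream = re.sub(r"%[^\n]*", "", stream)                 # drop comments
    inside, found = False, []
    for token in stream.split():
        if token == "BT":
            inside = True
        elif token == "ET":
            inside = False
        elif inside and token in PATH_OPERATORS:
            found.append(token)
    return found
```

Running path_ops_inside_text on ORIGINAL flags the re and F operators, while the stream from clipped_text_stream("Hallo Welt") passes the check.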

Create a tiff with only images and no text from a postscript file with ghostscript

Is it possible to convert a PostScript file (created from a PDF document with readable text and images) into a TIFF file that contains only the images and none of the text?
There is a way to create a TIFF with no images, but I don't know how to adapt it to my task. I need to generate two images from a PostScript file: the first with the images only and the second with the text only.
Since the text is drawn over the top of the image, simple clipping won't do the job.
You can hack the text out by redefining the show operators to no-ops. Insert this after the %%Page comment line (where the page code really starts).
/show{pop}def
/ashow{3{pop}repeat}def
/widthshow{4{pop}repeat}def
/awidthshow{6{pop}repeat}def
/kshow{2{pop}repeat}def
/xshow{2{pop}repeat}def
/xyshow{2{pop}repeat}def
/yshow{2{pop}repeat}def
/glyphshow{pop}def
/cshow{2{pop}repeat}def
This will suppress all text-drawing operators. Edit: Now includes level 2 and 3 operators.
If you're trying to selectively suppress different kinds of elements, you may want to redefine only some of these operators. You can add % at the beginning of a line to comment-out a line of the code, keeping the full list intact (for future uses).
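If you would rather not edit the file by hand, the insertion can be scripted. This is a rough Python sketch, assuming the file follows the DSC conventions and marks pages with %%Page: comments; the function name suppress_text is hypothetical. It inserts the overrides after every %%Page: line rather than only the first, on the assumption that a page wrapped in save/restore might otherwise undo the definitions.

```python
# The no-op replacements for the text-showing operators listed above.
SHOW_OVERRIDES = """\
/show{pop}def
/ashow{3{pop}repeat}def
/widthshow{4{pop}repeat}def
/awidthshow{6{pop}repeat}def
/kshow{2{pop}repeat}def
/xshow{2{pop}repeat}def
/xyshow{2{pop}repeat}def
/yshow{2{pop}repeat}def
/glyphshow{pop}def
/cshow{2{pop}repeat}def
"""


def suppress_text(ps_source: str) -> str:
    """Return the PostScript source with the overrides after each %%Page: line."""
    out = []
    patched = False
    for line in ps_source.splitlines(keepends=True):
        out.append(line)
        # "%%Page:" does not match "%%Pages:" or "%%PageTrailer".
        if line.startswith("%%Page:"):
            if not line.endswith(("\n", "\r")):
                out.append("\n")
            out.append(SHOW_OVERRIDES)
            patched = True
    if not patched:
        raise ValueError("no %%Page: comment found; insert the overrides manually")
    return "".join(out)
```

The patched file can then be rendered as usual, for example with something like gs -sDEVICE=tiff24nc -o images.tif patched.ps; the device and file names are only an example.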
Another way to selectively suppress elements from a ps file is to use the powerful clipping mechanism.
144 288 72 72 rectclip % clip to a 1"x1" square 2" from the left, 4" from the bottom
Clip works with an arbitrary path. So you can even string together points in a connect-the-dots fashion to get the effect of a lasso-clip. Probably easiest if you print the image out and trace a grid to easily plot points for the trajectory.
100 100 moveto
200 200 lineto
300 100 lineto
500 500 lineto
200 700 lineto
closepath clip % clip to a non-rectangular polygon (this one is concave)
Clipping will suppress all drawing operations that fall outside of the clippath while the clipping path is in effect.

Text rotation in PDF

So I have this situation:
using pdftoxml.exe from sourceforge.net I got text tokens and their coordinates. If the pdf file was rotated (i.e. it has a /Rotate 90 written in its source) pdftoxml.exe swaps height and width of a given page and also x and y coordinates of any given object. That is what I understand.
I was happy with it, until I came across a PDF file which used re to draw thick lines. That is, for a thick line, 4 thin lines are drawn and the space is filled, like in this picture. On the left you see two thin lines (non-colored), which are part of a bigger rectangle (highly zoomed in). I emptied the space in between, which was actually filled with black, to see the lines:
Additionally, the above PDF is rotated. So to get the B upright in the end, this text matrix was used: 0 1 -1 0 90.72 28.3705 Tm. The thin lines were drawn with something like 83.04 27.891 0.48 0.48 re (the coordinates may vary here, but it was some re operation like that; the operands are x y width height, and re stands for rectangle, see page 133 of Adobe's PDF 1.7 reference). What is relevant here is the calculation 27.891 + 0.48 = 28.371, which is not rounded or altered because of floating-point issues. It is the exact value for the line's x and, unfortunately, it is bigger than the hard-coded x of the B, which is 28.3705:
83.52 27.891 m 92.39999999999999 27.891 l s
92.39999999999999 27.891 m 92.39999999999999 28.371 l s
92.39999999999999 28.371 m 83.52 28.371 l s
83.52 28.371 m 83.52 27.891 l s
The page's dimensions are 842 x 595.2 according to PDF-XChange Viewer, measured from the upper left corner, which seems natural since the page is rotated. Unrotated, it would be the lower left corner, so that ought to be OK.
When the text is altered with 1 0 0 1 90.72 28.3705 Tm into its original orientation, one can see the collapsing bottom line with the line on the left:
which is what I would expect, since the y of the B is 28.3705 and the line's horizontal position is 28.371 (as can be seen in the second of the code lines above). So probably the bottom line of the B falls beyond 28.371, but I could not zoom in far enough to see that.
Now where does the gap between the line and the B come from in the first picture? This is important to me because I was trying to figure out which line is closest on the left of the B and was surprised by the two values, namely the supposed x value of the text that I get from pdftoxml.exe, which is 28.3705, and the line's horizontal value, 28.371. Since I knew the line is actually far to the left of the B, that could not be correct, at least not in the sense of "take the x position of the line, take the x position of the B, compare, and if the line's x is less than the B's x, the line is on the left".
I can't locate the correct line with the x values. Instead I get the other line on the very left, as if the text were falling in between the two.
This is the text drawing code:
BT
%0 7.5 -7.5 0 90.72 28.3705 Tm
0 1 -1 0 90.72 28.3705 Tm
%1 0 0 1 90.72 28.3705 Tm
/F1 1 Tf
1 Tr
q
0.01 w
(B) Tj
Q
ET
so, there is nothing fancy happening with the B's size or line thickness.
Can you help me figure out?
This is an updated picture with two I drawn on the same page, for the upper I using 0 1 -1 0 90.72 28.3705 Tm (rotated 90 degrees mathematically), for the lower one 1 0 0 1 90.72 28.3705 Tm. So I don't get it, how is the lower I rotated +90 and ends up being the upper one?
Here is the pdf code. It is rather big, but you should be able to copy it into your file and name it sth.pdf.
PDF Sample ( you have to actually zoom into the upper left corner real big to see the I )
EDIT
I actually found some interesting information about finding the glyph bounding box, but I could not yet put the pieces together.
Please have a look at
The glyph origin is the point (0, 0) in the glyph coordinate system. Tj and other text-showing operators shall position the origin of the first glyph to be painted at the origin of text space.
(shamelessly copied from Figure 39, section 9.2.4 of ISO 32000-1).
As you can see, the coordinates where the glyph is positioned, the glyph origin, is not necessarily where the actual glyph bounding box starts. This may explain the gap in your first image.
Thus, when you are trying to figure out which line is optically closest on the left of the B, it does not suffice to take the x position of the line, take the x position of the B, compare them, and conclude that the line is on the left if its x is less than the B's x. Instead, you also have to take the font data themselves into account and factor in the gap between the glyph origin and the glyph bounding box of the glyph represented by B.
For a more in-depth analysis please supply the font data.
EDIT concerning your double-I question... in your comment above you say you actually expected to see a common point - the rotation point - in both I characters, so you can get hands on a reliable horizontal coordinate for the left bounding box side of a character.
Isn't the point where the red lines cross, your rotation point? It should be the glyph origin for both Tj operations, and the I-glyphs have their origins there. Now you can measure from there on.
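To make the role of the glyph origin concrete, here is a small Python sketch that applies the two text matrices from the question to points in text space. The helper name apply_tm and the 0.08 left-side offset of the glyph bounding box are assumptions for illustration only; the real offset has to come from the font data.

```python
def apply_tm(matrix, point):
    """Map (x, y) through a PDF matrix [a b c d e f]: x' = a*x + c*y + e, y' = b*x + d*y + f."""
    a, b, c, d, e, f = matrix
    x, y = point
    return (a * x + c * y + e, b * x + d * y + f)


ROTATED = (0, 1, -1, 0, 90.72, 28.3705)  # text matrix used in the file
UPRIGHT = (1, 0, 0, 1, 90.72, 28.3705)   # the unrotated variant from the question

ORIGIN = (0.0, 0.0)     # glyph origin in text space
BBOX_LEFT = (0.08, 0.0) # hypothetical left edge of the glyph bounding box

for name, tm in (("rotated", ROTATED), ("upright", UPRIGHT)):
    print(name, "origin ->", apply_tm(tm, ORIGIN),
          "bbox left ->", apply_tm(tm, BBOX_LEFT))
```

Both matrices send the glyph origin to (90.72, 28.3705), which is the common rotation point. The assumed bounding box edge lands at a different place in each case: shifted along y under the rotated matrix and along x under the upright one. On a page with /Rotate 90 that y shift shows up as a horizontal gap.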

Is it possible to draw strokes for a path after restoring the graphics state in PDF?

I'm drawing lines in PDF that I want to scale in a ratio other than 1:1.
The problem is that I get strokes that look like they have been drawn with a calligraphic pen.
Is it possible somehow in PDF to resize the path, restore the graphics state, and then draw the stroke of the previous path?
This is how I get calligraphic line strokes in PDF:
5 w % stroke width
q % save the current graphics state
1 0 0 0.2 0 0 cm % transformation matrix: scale the height to 20%
0 10 m % start of the line
10 10 l % line to
20 100 l
30 100 l
40 10 l
S % stroke the path
Q % restore the graphics state
In HTML5 canvas it's possible to draw the stroke after restoring the graphics state, so that the path is drawn with a line of equal width.
http://www.html5canvastutorials.com/advanced/html5-canvas-ovals/
In PDF putting S after Q doesn't work.
Is there some way to get the same result in PDF where only the line path gets scaled, not the stroke itself?
Have a look at Figure 9 - Graphics Objects - on page 113 of the PDF specification ISO 32000-1:2008. It illustrates that as soon as you have started constructing a path, the only allowed operators are those for path construction, path clipping, and path painting. Q being a special graphics state operator is only allowed after a path painting operator, e.g. your S.
This also is stated in the example right below the graphic:
The path construction operators m and re signal the beginning
of a path object. Inside the path object, additional path construction
operators are permitted, as are the clipping path operators W and
W*, but not general graphics state operators such as w or J.
A path-painting operator, such as S or f, ends the path object
and returns to the page description level.
Thus in response to "Is there some way to get the same result in PDF where only the line path gets scaled, not the stroke itself?": No, you have to explicitly select a smaller stroke width to compensate for the different scale introduced by the transformation matrix.
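A single stroke width can only compensate for a uniform scale; with a non-uniform matrix the pen itself is stretched. An alternative that is not part of the answer above is to drop the cm operator and scale the path coordinates yourself before writing them, so the stroke is applied in unscaled user space. A minimal Python sketch, with the hypothetical names fmt and scaled_path_stream:

```python
def fmt(value: float) -> str:
    """Format a number compactly for a content stream (at most 4 decimals)."""
    return f"{value:.4f}".rstrip("0").rstrip(".")


def scaled_path_stream(points, sx, sy, line_width):
    """Emit a stroked polyline whose points are pre-scaled by (sx, sy).

    No cm operator is written, so the line width is not affected by the scale.
    """
    ops = [f"{fmt(line_width)} w"]
    for index, (x, y) in enumerate(points):
        operator = "m" if index == 0 else "l"  # move to the first point, line to the rest
        ops.append(f"{fmt(x * sx)} {fmt(y * sy)} {operator}")
    ops.append("S")
    return "\n".join(ops)


# Path from the question, with the height reduced to 20%.
POINTS = [(0, 10), (10, 10), (20, 100), (30, 100), (40, 10)]
print(scaled_path_stream(POINTS, 1, 0.2, 5))
```

The output contains the same five path segments as the question's example, only with the y coordinates already multiplied by 0.2, and the 5-unit stroke keeps the same thickness in every direction.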