TrueType glyphs are made of quadratic Beziers. Why does more than one consecutive off-curve point appear in a glyph outline? - truetype

I'm writing a TTF parser. To better understand the TTF format, I used TTX to extract the ".notdef" glyph data of C:\Windows\calibri.ttf, as follows.
<TTGlyph name=".notdef" xMin="0" yMin="-397" xMax="978" yMax="1294">
<contour>
<pt x="978" y="1294" on="1"/>
<pt x="978" y="0" on="1"/>
<pt x="44" y="0" on="1"/>
<pt x="44" y="1294" on="1"/>
</contour>
<contour>
<pt x="891" y="81" on="1"/>
<pt x="891" y="1213" on="1"/>
<pt x="129" y="1213" on="1"/>
<pt x="129" y="81" on="1"/>
</contour>
<contour>
<pt x="767" y="855" on="1"/>
<pt x="767" y="796" on="0"/>
<pt x="732" y="704" on="0"/>
<pt x="669" y="641" on="0"/>
<pt x="583" y="605" on="0"/>
<pt x="532" y="602" on="1"/>
<pt x="527" y="450" on="1"/>
many more points
</contour>
...some other xml
</TTGlyph>
You can see more than one off-curve control point in a row. But I've learned that TrueType fonts are made of quadratic Beziers, each of which has two on-curve points (end points) and only one off-curve point (control point). How should these consecutive off-curve points be interpreted?

Parsing TTF means working from the glyf table specification at http://www.microsoft.com/typography/otspec/glyf.htm as well as the other tech docs about the TTF format on the Microsoft site. These tell us that there are two types of points for a curve: on-curve and off-curve points. On-curve points are "real" points, through which a curve passes, and off-curve points are control points that guide the Bezier curvature.
Now, what you describe as "a bezier curve" is correct: a single (quadratic) bezier curve goes from 1 real point, guided by 1 control point, to 1 real point (higher order curves like cubics, quartics, etc. have more control points between the real points). However, quadratic curves are generally terrible for design work because they are really bad at approximating circular arcs, but are cheaper to work with than higher order curves, so we're stuck with them for fonts that use TrueType for glyph outlines. To get around the downside of quadratic curves, TrueType outlines generally use sequences of bezier curves rather than single curves in order to get decent-looking uniform curves, and those sequences tend to have a nice property: the on- and off-curve points are spaced in a way that we don't need to record every point in the sequence.
Consider this Bezier sequence:
P1 - C1 - P2 - C2 - P3 - C3 - P4
If we add the on flag for each point, we'd encode it in TTF as:
P1 - C1 - P2 - C2 - P3 - C3 - P4
 1 -  0 -  1 -  0 -  1 -  0 -  1
Now for the trick: if each Pn is an on-curve point, and each Cn is a control point, and P2 lies exactly mid-way between C1 and C2, P3 lies exactly mid-way between C2 and C3, and so on, then this curve representation can be compacted a lot, because if we know C1 and C2, we know P2, etc. We don't have to list any of the mid-way points explicitly, we can just leave that up to whatever parses the glyph outline.
So TTF allows you to encode long bezier sequences with the above property as:
P1 - C1 - C2 - C3 - P4
 1 -  0 -  0 -  0 -  1
As you can see: we're saving considerable space, without loss of precision. If you look at your TTX dump, you'll see this reflected in the on values for each point. To get the P2, P3, etc, all we do is this:
def get_points(glyph):
    # assumes glyph.point_array and glyph.flags are parallel lists for one contour,
    # where glyph.flags[i] is 1 for an on-curve point and 0 for an off-curve point
    points = []
    previous_point = None
    previous_on = None
    for point, on in zip(glyph.point_array, glyph.flags):
        # do we have an implied on-curve point?
        if previous_point is not None and on == 0 and previous_on == 0:
            missing_point = midpoint(point, previous_point)
            points.append(missing_point)
        # add the explicitly encoded point
        points.append(point)
        previous_point = point
        previous_on = on
    return points
After running this procedure, the points array alternates between on-curve and off-curve points wherever the outline is curved, and the beziers are constructed as:
for i in range(0, len(points) - 2, 2):
    curve(points[i], points[i+1], points[i+2])
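For anyone who wants to try the midpoint rule on real data, here is a minimal sketch. It assumes a contour is given as a list of (x, y, on) tuples, which is my own layout and not something the TTF format or TTX prescribes, and insert_implied_points is a hypothetical helper name. It also treats the contour as closed, so the last point is compared against the first.

```python
def insert_implied_points(contour):
    """Return a copy of `contour` with the implied on-curve points added.

    `contour` is a list of (x, y, on) tuples for one closed contour
    (an assumed layout, chosen for this sketch only).
    """
    expanded = []
    for i, (x, y, on) in enumerate(contour):
        # contour[i - 1] wraps around to the last point when i == 0,
        # because a TrueType contour is a closed loop
        prev_x, prev_y, prev_on = contour[i - 1]
        if not on and not prev_on:
            # two off-curve points in a row: the on-curve point between
            # them sits exactly at their midpoint
            expanded.append(((prev_x + x) / 2, (prev_y + y) / 2, True))
        expanded.append((x, y, on))
    return expanded


# the first six points of the third contour from the question,
# treated as a closed contour purely for illustration
contour = [
    (767, 855, True),
    (767, 796, False),
    (732, 704, False),
    (669, 641, False),
    (583, 605, False),
    (532, 602, True),
]
for point in insert_implied_points(contour):
    print(point)
```

Three midpoints get inserted here, one between each pair of neighbouring off-curve points, after which every off-curve point has an on-curve point on either side.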
Edit: after a bit of searching, http://chanae.walon.org/pub/ttf/ttf_glyphs.htm covers how to work with the glyf table data in pretty good detail (the ascii graphics are a bit silly, but still legible enough).
Further edit: after several years I managed to find documentation that actually explains (or, at least implies) it in the Apple documentation on TTF, over on https://developer.apple.com/fonts/TrueType-Reference-Manual/RM01/Chap1.html#necessary, which in "Figure 13" states that:
In particular the on-curve points, located at the midpoints of the tangents to the curve, add no extra information and might have been omitted.
Even further edit: ShreevatsaR points out that the text between Figures 2 and 3 in the Apple documentation is also relevant:
It would also be possible to specify the curve shown in FIGURE 2 with one fewer point by removing point p2. Point p2 is not strictly needed to define the curve because its existence implied and its location can be reconstructed from the data given by the other points. After renumbering the remaining points, we have [FIGURE 3].

Related

Simplify high-order Bezier curve

I have an array of control points that represent a Bezier curve. It could be a fifth-order or 100th-order Bezier curve, or anything in between. I am looking for a way to simplify that Bezier curve into multiple cubic Bezier curves. An illustration below shows how a tenth-degree curve can be simplified to a third-degree curve, but I want to go further and simplify it to several cubic Bezier curves to achieve a better approximation.
Code example would be very helpful.
As mohsenmadi already pointed out: in general this is not a thing you can do without coming up with your own error metric. Another idea is to go "well let's just approximate this curve as a sequence of lower order curves", so that we get something that looks better, and doesn't really require error metrics. This is a bit like "flattening" the curve to lines, except instead of lines we're going to use cubic Bezier segments instead, which gives nice looking curves, while keeping everything "tractable" as far as modern graphics libraries are concerned.
Then what we can do is: split up that "100th order curve" into a sequence of cubic Beziers by sampling the curve at regular intervals and then running those points through a Catmull-Rom algorithm. The procedure's pretty simple:
1. Pick some regularly spaced values for t, like 0, 0.2, 0.4, 0.6, 0.8 and 1, then
2. create the set of points tvalues.map(t => getCoordinate(curve, t)). Then,
3. build a virtual start and end point: form a point 0 by starting at point 1 and moving back along its tangent, and form a point n+1 by starting at point n and following its tangent. We do this because every Catmull-Rom segment needs a neighbouring point on either side. Finally,
4. build the poly-Catmull-Rom, starting at virtual point 0 and ending at virtual point n+1.
Let's do this in pictures. Let's start with an 11th order Bezier curve:
And then let's just sample that at regular intervals:
We invent a 0th and n+1st point:
And then we run the Catmull-Rom procedure:
i = 0
e = points.length-4
curves = []
do {
crset = points.subset(i, 4)
curves.push(formCRCurve(crset))
} while(i++<e)
What does formCRCurve do? Good question:
formCRCurve(points: p1, p2, p3, p4):
d_start = vector(p2.x - p1.x, p2.y - p1.y)
d_end = vector(p4.x - p3.x, p4.y - p3.y)
return Curve(p2, d_start, d_end, p3)
So we see why we need those virtual points: given four points, we can form a Catmull-Rom curve from point 2 to point 3, using the tangent information we get with a little help from points 1 and 4.
Of course, we actually want Bezier curves, not Catmull-Rom curves, but because they're the same "kind" of curve, we can freely convert between the two, so:
i = 0
e = points.length-4
bcurves = []
do {
pointset = points.subset(i, 4)
bcurves.push(formBezierCurve(pointset))
} while(i++<e)
formBezierCurve(points: p1, p2, p3, p4):
    return bezier(
        p2,
        p2 + (p3 - p1)/6,
        p3 - (p4 - p2)/6,
        p3
    )
So a Catmull-Rom curve based on points {p1,p2,p3,p4}, which passes through points p2 and p3, can be written as an equivalent Bezier curve that uses the start/control1/control2/end coordinates p2, p2 + (p3 - p1)/6, p3 - (p4 - p2)/6, and p3.
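Putting the whole procedure together, a rough Python sketch could look like the following. The function names are mine, and one detail is an assumption the description above leaves open: how far to move along the end tangents when inventing the virtual points. Here I simply reuse the distance between the two nearest samples.

```python
import math


def bezier_point(ctrl, t):
    """Evaluate a Bezier curve of any order at t with de Casteljau's algorithm."""
    pts = list(ctrl)
    while len(pts) > 1:
        pts = [((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1)
               for (x0, y0), (x1, y1) in zip(pts, pts[1:])]
    return pts[0]


def step_along(origin, direction, length):
    """Move `length` away from `origin` along `direction` (assumed non-zero)."""
    norm = math.hypot(direction[0], direction[1])
    return (origin[0] + direction[0] / norm * length,
            origin[1] + direction[1] / norm * length)


def to_cubic_beziers(ctrl, steps=5):
    """Approximate a high-order Bezier curve with `steps` cubic Bezier segments."""
    pts = [bezier_point(ctrl, i / steps) for i in range(steps + 1)]
    # the end tangents of a Bezier curve run along the first and last legs
    # of its control polygon; the step length below is an assumption
    back = (ctrl[0][0] - ctrl[1][0], ctrl[0][1] - ctrl[1][1])
    forward = (ctrl[-1][0] - ctrl[-2][0], ctrl[-1][1] - ctrl[-2][1])
    first_gap = math.hypot(pts[1][0] - pts[0][0], pts[1][1] - pts[0][1])
    last_gap = math.hypot(pts[-1][0] - pts[-2][0], pts[-1][1] - pts[-2][1])
    pts = ([step_along(pts[0], back, first_gap)] + pts
           + [step_along(pts[-1], forward, last_gap)])
    cubics = []
    for p1, p2, p3, p4 in zip(pts, pts[1:], pts[2:], pts[3:]):
        # Catmull-Rom to Bezier conversion, as in formBezierCurve
        c1 = (p2[0] + (p3[0] - p1[0]) / 6, p2[1] + (p3[1] - p1[1]) / 6)
        c2 = (p3[0] - (p4[0] - p2[0]) / 6, p3[1] - (p4[1] - p2[1]) / 6)
        cubics.append((p2, c1, c2, p3))
    return cubics


# a made-up fifth-order curve, just to exercise the code
curve = [(0, 0), (1, 3), (2, -2), (4, 4), (5, -1), (6, 0)]
for segment in to_cubic_beziers(curve, steps=5):
    print(segment)
```

Each cubic starts and ends on a sampled point of the original curve, so neighbouring segments join up; raising steps tightens the approximation at the cost of more segments.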
First, you have to know that there are no approximating lower-degree curves that would do you justice! You are bound to introduce errors; there is no escape. The question then is: how do you approximate such that the original and resultant curves are visually similar?
Assume your original curve is of degree n. First, subdivide it. You can subdivide a curve as many times as you want without introducing any errors. Here, the degree of each subdivision is still n, but the geometric complexity and rate of curvature are reduced considerably. Second, you reduce the degree of each subdivision, which by now is a simple shape with no high curvature that would introduce approximation errors.
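The subdivision step can be sketched with de Casteljau's algorithm, which splits a Bezier curve of any degree into two curves of the same degree without changing its shape. This is only the first half of the suggestion (the degree reduction step is not shown), and split_bezier is a name I made up.

```python
def split_bezier(ctrl, t=0.5):
    """Split a Bezier curve at parameter t into two curves of the same degree.

    `ctrl` is a list of (x, y) control points. The returned halves trace
    exactly the same shape as the original curve.
    """
    left, right = [], []
    pts = list(ctrl)
    while pts:
        # the first point of every de Casteljau level belongs to the left half,
        # the last point of every level to the right half
        left.append(pts[0])
        right.append(pts[-1])
        pts = [((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1)
               for (x0, y0), (x1, y1) in zip(pts, pts[1:])]
    return left, right[::-1]


left, right = split_bezier([(0, 0), (2, 4), (4, 0)])
print(left)
print(right)
```

Calling split_bezier again on each half gives four pieces, and so on, until every piece is flat enough for a low-degree approximation to be acceptable.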

Not a knapsack or bin algorithm

I need to find a combination of rectangles that will maximize the use of the area of a circle. The difference between my situation and the classic problems is I have a set of rectangles I CAN use, and a subset of those rectangles I MUST use.
By way of an analogy: Think of an end of a log and list of board sizes. I can cut 2x4s, 2x6s and 2x8s and 2x10 from a log but I must cut at least two 2x4s and one 2x8.
As I understand it, my particular variation is mildly different than other packing optimizations. Thanks in advance for any insight on how I might adapt existing algorithms to solve this problem.
NCDiesel
This is actually a pretty hard problem, even with squares instead of rectangles.
Here's an idea. Approach it as a knapsack-style integer program, which can give you some insights into the solution. (Being a heuristic, it is not guaranteed to give you the optimal solution.)
IP Formulation Heuristic
Say you have a total of n rectangles, r1, r2, r3, ..., rn
Let the area of each rectangle be a1, a2, a3, ..., an
Let the area of the large circle you are given be *A*
Decision Variable
Xi = 1 if rectangle i is selected. 0 otherwise.
Objective
Minimize [A - Sum_over_i (ai * Xi)]
Subject to:
Sum_over_i (ai * Xi) <= A # Area_limit constraint
Xk = 1 for each rectangle k that has to be selected
You can solve this using any solver.
Now, the reason this is a heuristic is that this solution totally ignores the arrangement of the rectangles inside the circle. In effect, it allows rectangles to be "cut" into smaller pieces to fit inside the circle. (That is why the Area_limit constraint is only a weak bound.)
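For a handful of rectangles you don't even need a solver; the formulation above can be checked by brute force. The sketch below is only an illustration of the area bound (the function name and the example numbers are made up), and its running time grows exponentially with the number of optional rectangles.

```python
from itertools import combinations


def best_area_selection(areas, required, circle_area):
    """Pick rectangles maximizing total area without exceeding circle_area.

    `areas` lists the area of each rectangle, `required` holds the indices
    that must be selected. Returns (total_area, chosen_indices), or None if
    the required rectangles alone already exceed the circle's area.
    Geometry is ignored entirely, exactly as in the formulation above.
    """
    required = sorted(required)
    optional = [i for i in range(len(areas)) if i not in required]
    base = sum(areas[i] for i in required)
    if base > circle_area:
        return None
    best = (base, tuple(required))
    for k in range(1, len(optional) + 1):
        for extra in combinations(optional, k):
            total = base + sum(areas[i] for i in extra)
            if best[0] < total <= circle_area:
                best = (total, tuple(sorted(required + list(extra))))
    return best


# two 2x4s and one 2x8 are mandatory; a 2x6 and a 2x10 are optional
areas = [8, 8, 16, 12, 20]
print(best_area_selection(areas, required={0, 1, 2}, circle_area=50))
```

Whatever this returns is an upper bound on the usable area: a real packing can never do better, and usually does worse, because the chosen rectangles still have to fit inside the circle without overlapping.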
Relevant Reference
This Math SE question addresses the "classic" version of it.
And you can look at the link provided as comments in there, for several clever solutions involving squares of the same size packed inside a circle.

Resolution independent cubic bezier drawing on GPU (Blinn/Loop)

Based on the following resources, I have been trying to get resolution independent cubic bezier rendering on the GPU to work:
GPU Gems 3 Chapter 25
Curvy Blues
Resolution Independent Curve Rendering using Programmable Graphics Hardware
But as stated in the Curvy Blues website, there are errors in the documents on the other two websites. Curvy Blues tells me to look at the comments, but I don't seem to be able to find those comments. Another forum somewhere tells me the same, I don't remember what that forum was. But there is definitely something I am missing.
Anyway, I have tried to regenerate what is happening and I fail to understand the part where the discriminant is calculated from the determinants of a combination of transformed coordinates.
So I have the original coordinates, I stick them in a 4x4 matrix, transform this matrix with the M3-matrix and get the C-matrix.
Then I create 3x3 matrices from the coordinates in the C-matrix and calculate the determinants, which then can be combined to create the a, b and c of the quadratic equation that will help me find the roots.
Problem is, when I do it exactly like that: the discriminant is incorrect. I clearly put in coordinates for a serpentine (a symmetric one, but a correct serpentine), but it states it is a cusp.
When I calculate it myself using wxMaxima, deriving to 1st and 2nd order and then calculating the cross-product, simplifying to a quadratic equation, the discriminant of that equation seems to be correct when I put in the same coordinates.
When I force the code to use my own discriminant to determine if it's a serpentine or not, but I use the determinants to calculate the further k,l,m texture coordinates, the result is also incorrect.
So I presume there must be an error in the determinants.
Can anyone help me get this right?
I think I have managed to solve it. The results are near to perfect (sometimes inverted, but that's probably a different problem).
This is where I went wrong, and I hope I can help other people to not waste all the time I have wasted searching this.
I have based my code on the Loop/Blinn document.
I had coordinates b0, b1, b2, b3. I used to view them as 2D coordinates with a w, but I have changed this view, and this solved the problem. By viewing them as 3D coordinates with z = 0, and making them homogeneous 4D coordinates for transformation (w = 1), the solution arrived.
By calculating the C matrix: C = M3 * B, I got these new coordinates.
When calculating the determinants d0, d1, d2, d3, I used to take the x, y coordinates from columns 0 and 1 in the C matrix, and the w factor from column 2. WRONG! When you think of it, the coordinates are actually 3D coordinates, so, for the w-factors, one should take column 3 and ignore column 2.
This gave me correct determinants, resulting in a discriminant that was able to sort out what type of curve I was handling.
But beware, what made my search even longer was the fact that I assumed that when it is visibly a serpentine, the result of the discriminant should always be > 0 (serpentine).
But this is not always the case: when you have a mathematically perfect serpentine (one whose coordinates are exactly symmetric about the middle), the discriminant will say it's a cusp (discriminant = 0). I used to think that this result was wrong, but it isn't. So don't be fooled by this.
The book GPU Gems 3 has a mistake here, and the page on NVIDIA's website has the mistake too:
a3 = b2 * (b1 x b1)
It's actually a3 = b2 * (b1 x b0).
There are other problems with this algorithm: the exponent of the floating-point values can overflow during the calculation, so you should be cautious and add normalization steps to your code.
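To see the corrected a3 in action, here is a small Python sketch of the classification step. Only the a3 correction comes from this thread; the expressions for a1, a2, d1, d2, d3 and the discriminant are written down as I remember them from the GPU Gems 3 chapter, so treat them as an assumption and check them against the chapter. The function names are mine, and the degenerate cases (quadratic, line, point) are not separated out.

```python
def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Dot product of two 3-vectors."""
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def classify_cubic(p0, p1, p2, p3):
    """Classify a 2D cubic Bezier as 'serpentine', 'loop' or 'cusp'.

    The control points are lifted to homogeneous coordinates (x, y, 1).
    Everything with a zero discriminant is reported as 'cusp', which also
    swallows the degenerate cases in this simplified sketch.
    """
    b0, b1, b2, b3 = [(x, y, 1) for x, y in (p0, p1, p2, p3)]
    a1 = dot(b0, cross(b3, b2))
    a2 = dot(b1, cross(b0, b3))
    a3 = dot(b2, cross(b1, b0))  # corrected: b1 x b0, not b1 x b1
    d1 = a1 - 2 * a2 + 3 * a3
    d2 = -a2 + 3 * a3
    d3 = 3 * a3
    discr = d1 * d1 * (3 * d2 * d2 - 4 * d1 * d3)
    if discr > 0:
        kind = "serpentine"
    elif discr < 0:
        kind = "loop"
    else:
        kind = "cusp"
    return kind, (d1, d2, d3), discr


print(classify_cubic((0, 0), (1, 1), (2, -1), (4, 0)))  # lopsided S shape
print(classify_cubic((0, 0), (1, 1), (2, -1), (3, 0)))  # perfectly symmetric S shape
print(classify_cubic((0, 0), (3, 3), (-2, 3), (1, 0)))  # crossed-over control points
```

The second call is the symmetric case described above: d1 comes out as exactly 0, so the discriminant is 0 even though the curve looks like a serpentine. Integer inputs keep the arithmetic exact here; with floats you would want a tolerance and some normalization of (d1, d2, d3).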

Contours based on a "label mask"

I have images that have had features extracted with a contouring algorithm (I'm doing astrophysical source extraction). This approach yields a "feature map" that has each pixel "labeled" with an integer (usually ~1000 unique features per map).
I would like to show each individual feature as its own contour.
One way I could accomplish this is:
for ii in range(1, labelmask.max() + 1):
    contour(labelmask, levels=[ii - 0.5])
However, this is very slow, particularly for large images. Is there a better (faster) way?
P.S.
A little testing showed that skimage's find-contours is no faster.
As per @tcaswell's comment, I need to explain why contour(labels, levels=np.unique(levels)+0.5) or something similar doesn't work:
1. Matplotlib spaces each subsequent contour "inward" by a linewidth to avoid overlapping contour lines. This is not the behavior desired for a labelmask.
2. The lowest-level contours encompass the highest-level contours
3. As a result of the above, the highest-level contours will be surrounded by a miniature version of whatever colormap you're using and will have extra-thick contours compared to the lowest-level contours.
Sorry for answering my own... impatience (and good luck) got the better of me.
The key is to use matplotlib's low-level C routines:
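import numpy as np
import matplotlib._cntr  # private module, not present in newer Matplotlib releases
from matplotlib.pyplot import imshow, plot

I = imshow(data)
E = I.get_extent()
x, y = np.meshgrid(np.linspace(E[0], E[1], labels.shape[1]),
                   np.linspace(E[2], E[3], labels.shape[0]))
for ii in np.unique(labels):
    if ii == 0:
        continue  # label 0 is the background
    tracer = matplotlib._cntr.Cntr(x, y, labels * (labels == ii))
    T = tracer.trace(0.5)
    contour_xcoords, contour_ycoords = T[0].T
    # to plot them:
    plot(contour_xcoords, contour_ycoords)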
Note that labels*(labels==ii) will put each label's contour at a slightly different location; change it to just labels==ii if you want overlapping contours between adjacent labels.

Placement of "good" control points in Bezier curves

I've been working on this problem for a while now, and haven't been able to come up with a good solution thus far.
The problem: I have an ordered list of three (or more) 2D points, and I want to stroke through these with a cubic Bezier curve, in such a way that it "looks good." The "looks good" part is pretty simple: I just want the wedge at the second point smoothed out (so, for example, the curve doesn't double-back on itself). So given three points, where should one place the two control points that would surround the second point in the triplet when drawing the curve.
My solution so far is as follows, but is incomplete. The idea might also help communicate the look that I'm after.
Given three points, (x1,y1), (x2,y2), (x3,y3). Take the circle passing through each triplet of points (if they are collinear, we just draw a straight line between them and move on). Take the line tangent to this circle at point (x2,y2) -- we will place the control points that surround (x2,y2) on this tangent line.
It's the last part that I'm stuck on. The problem I'm having is finding a way to place the two control points on this tangent line -- I have a good enough heuristic on how far from (x2,y2) on this line they should be, but of course, there are two points on this line that are that distance away. If we compute the one in the "wrong" direction, the curve loops around on itself.
To find the center of the circle described by the three points (if any of the points have the same x value, simply reorder the points in the calculation below):
double ma = (point2.y - point1.y) / (point2.x - point1.x);
double mb = (point3.y - point2.y) / (point3.x - point2.x);
CGPoint c; // Center of a circle passing through all three points.
c.x = (((ma * mb * (point1.y - point3.y)) + (mb * (point1.x + point2.x)) - (ma * (point2.x + point3.x))) / (2 * (mb - ma)));
c.y = (((-1 / ma) * (c.x - ((point1.x + point2.x) / 2))) + ((point1.y + point2.y) / 2));
Then, to find the points on the tangent line, in this case, finding the control point for the curve going from point2 to point3:
double d = ...; // distance we want the point. Based on the distance between
// point2 and point3.
// mc: Slope of the line perpendicular to the line between
// point2 and c.
double mc = - (c.x - point2.x) / (c.y - point2.y);
CGPoint tp; // point on the tangent line
double intercept = point2.y - mc * point2.x; // y intercept of the tangent line
tp.x = ???; // can't figure this out, the question is whether it should be
// less than point2.x, or greater than?
tp.y = mc * tp.x + intercept;
// then, compute a point cp that is distance d from point2 going in the direction
// of tp.
It sounds like you might need to figure out the direction the curve is going, in order to set the tangent points so that it won't double back on itself. From what I understand, it would be simply finding out the direction from (x1, y1) to (x2, y2), and then travelling on the tangent line your heuristic distance in the direction closest to the (x1, y1) -> (x2, y2) direction, and plopping the tangent point there.
If you're really confident that you have a good way of choosing how far along the tangent line your points should be, and you only need to decide which side to put each one on, then I would suggest that you look once again at that circle to which the line is tangent. You've got z1,z2,z3 in that order on the circle; imagine going around the circle from z2 towards z1, but go along the tangent line instead; that's which side the control point "before z2" should be; the control point "after z2" should be on the other side.
Note that this guarantees always to put the two control points on opposite sides of z2, which is important. (Also: you probably want them to be the same distance from z2, because otherwise you'll get a discontinuity at z2 in, er, the second derivative of your curve, which is likely to look a bit suboptimal.) I bet there will still be pathological cases.
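Here is one way the "which side" rule could be coded up, as a sketch rather than a tested drop-in. It works with direction vectors instead of slopes, which sidesteps the vertical-line special cases, and it decides the side from the turning direction of z1, z2, z3 (the sign of a cross product). The function names and the dist parameter, which stands for your own distance heuristic, are assumptions of mine.

```python
import math


def circumcenter(a, b, c):
    """Center of the circle through three non-collinear 2D points."""
    d = 2 * (a[0] * (b[1] - c[1]) + b[0] * (c[1] - a[1]) + c[0] * (a[1] - b[1]))
    if d == 0:
        raise ValueError("points are collinear")
    sa = a[0] ** 2 + a[1] ** 2
    sb = b[0] ** 2 + b[1] ** 2
    sc = c[0] ** 2 + c[1] ** 2
    ux = (sa * (b[1] - c[1]) + sb * (c[1] - a[1]) + sc * (a[1] - b[1])) / d
    uy = (sa * (c[0] - b[0]) + sb * (a[0] - c[0]) + sc * (b[0] - a[0])) / d
    return (ux, uy)


def control_points_at(p1, p2, p3, dist):
    """Return (before, after): the control points on either side of p2.

    Both lie on the tangent to the circle through p1, p2, p3 at p2,
    `dist` away from p2, with `after` pointing the way the circle is
    travelled when going p1 -> p2 -> p3.
    """
    cx, cy = circumcenter(p1, p2, p3)
    rx, ry = p2[0] - cx, p2[1] - cy  # radius vector from the center to p2
    # the sign of the cross product tells whether p1 -> p2 -> p3 turns
    # counter-clockwise (> 0) or clockwise (< 0)
    turn = (p2[0] - p1[0]) * (p3[1] - p2[1]) - (p2[1] - p1[1]) * (p3[0] - p2[0])
    tx, ty = (-ry, rx) if turn > 0 else (ry, -rx)
    norm = math.hypot(tx, ty)
    tx, ty = tx / norm, ty / norm
    before = (p2[0] - dist * tx, p2[1] - dist * ty)
    after = (p2[0] + dist * tx, p2[1] + dist * ty)
    return before, after


print(control_points_at((0, 0), (1, 1), (2, 0), 0.5))
```

Because before and after are mirrored through the middle point, they always end up on opposite sides of it and at the same distance, which is the property asked for above.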
If you don't mind a fair bit of code complexity, there's a sophisticated and very effective algorithm for exactly your problem (and more) in Don Knuth's METAFONT program (whose main purpose is drawing fonts). The algorithm is due to John Hobby. You can find a detailed explanation, and working code, in METAFONT or, perhaps better, the closely related METAPOST (which generates PostScript output instead of huge bitmaps).
Pointing you at it is a bit tricky, though, because METAFONT and METAPOST are "literate programs", which means that their source code and documentation consist of a kind of hybrid of Pascal code (for METAFONT) or C code (for METAPOST) and TeX markup. There are programs that will turn this into a beautifully typeset document, but so far as I know no one has put the result on the web anywhere. So here's a link to the source code, which you may or may not find entirely incomprehensible: http://foundry.supelec.fr/gf/project/metapost/scmsvn/?action=browse&path=%2Ftrunk%2Fsource%2Ftexk%2Fweb2c%2Fmplibdir%2Fmp.w&view=markup -- in which you should search for "Choosing control points".
(The beautifully-typeset document for METAFONT is available as a properly bound book under the title "METAFONT: the program". But it costs actual money, and the code is in Pascal.)