Composite Font Requirements (was RE: [mpeg-OTspec] AHG on Open Font Format
Ken Lunde
lunde at adobe.com
Mon Mar 16 21:42:11 CET 2009
All,
I am still digesting the recent posts by Karsten, John, Jeff, and
Mikhail. I am also attending the Worldware Conference for the next
three days, turning me into a pumpkin for most of this week.
I did want to convey some thoughts before that happens, though...
One question that has been raised is about encoding and Unicode.
Historically, encoding has been the "glue" for legacy Composite Font
formats. Text is what applications use, and the encoded values are
used to eventually get to the glyphs. For the Composite Font format
that we're discussing, we need some type of glue, and Unicode serves
this purpose well, given its broad overall coverage of languages and
scripts. In many ways, Composite Fonts are merely convenience
mechanisms for interfacing what are logically multiple fonts, and
doing so as though they were a single font. This is how we are able to
effectively break the 64K glyph barrier.
I would also argue that once the encoded values, such as Unicode,
become GIDs, which are necessarily specific to each component font of
a Composite Font, you're no longer dealing with the Composite Font as
a whole, but rather you're dealing with the component font. And yes,
Composite Fonts will have limitations. To some extent, using Unicode
as the glue may be thought of as the source of the limitation, but at
least for today, it seems to be the best glue we have. Our original
Composite Font format used Shift-JIS encoding as the glue, so using
Unicode is a fairly huge step forward. In terms of limitations, it
will be very difficult for glyphs to interact across component fonts,
because GSUB features operate on GIDs, not character codes. This means
that if a developer wants certain glyphs to interact in a GSUB
feature, they should be in the same component font. Luckily, all of
the glyphs for each supported script are likely to be in a single
component font. The only possible exception is likely to be the CJK
Unified Ideographs. If we consider ISO 10646 up through and including
Amendment 6, there are 74,382 such characters. This is obviously over
the 64K glyph barrier. Some CJK Unified Ideographs do interact via
GSUB features, such as the simplified, traditional, and variant forms.
A large number of them do not interact.
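To make the "glue" idea concrete, here is a minimal Python sketch of how a client might resolve a Unicode code point to a component font and a GID. The class names, the font names, and the tiny cmaps are all hypothetical and only serve as illustration; nothing here is a proposed format or an existing API. It also works out the lower bound on component fonts implied by the 74,382 figure above, assuming 16-bit GIDs.

```python
import math
from dataclasses import dataclass

GID_SPACE = 2 ** 16          # 16-bit glyph IDs: the "64K glyph barrier"
CJK_IDEOGRAPHS = 74_382      # figure quoted above (through Amendment 6)

# Lower bound on component fonts needed just for the ideographs.
MIN_COMPONENTS = math.ceil(CJK_IDEOGRAPHS / GID_SPACE)


@dataclass
class ComponentFont:
    name: str   # hypothetical font name
    cmap: dict  # Unicode code point -> GID, local to this component


@dataclass
class CompositeFont:
    components: list  # searched in order; first match wins (an assumption)

    def lookup(self, code_point):
        """Return (component name, GID) for a code point, or None."""
        for font in self.components:
            gid = font.cmap.get(code_point)
            if gid is not None:
                return font.name, gid
        return None

    def can_share_gsub(self, cp_a, cp_b):
        """GSUB works on GIDs, so both glyphs must be in one component."""
        a, b = self.lookup(cp_a), self.lookup(cp_b)
        return a is not None and b is not None and a[0] == b[0]


# Illustrative data only: U+570B (traditional) and U+56FD (simplified)
# deliberately placed in different components.
part1 = ComponentFont("ExampleCJK-1", {0x4E00: 1, 0x570B: 2})
part2 = ComponentFont("ExampleCJK-2", {0x56FD: 1})
composite = CompositeFont([part1, part2])

print(MIN_COMPONENTS)
print(composite.lookup(0x56FD))
print(composite.can_share_gsub(0x570B, 0x56FD))
```

Note that GID 1 exists in both components: once the lookup has returned, the GID only means something relative to the component font it came from, which is why the traditional/simplified pair in this example cannot be reached by a single GSUB feature.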
I very much like the idea of using plain XML for representing this
format. I also favor specifying flags or characteristics that may
trigger certain behavior. For example, we can define a "cross-
platform" flag that triggers requirements, such as Unicode encoding
for the component fonts, a flat structure (a Composite Font cannot be
used as a component font of another Composite Font), and perhaps even
font format (OpenType). If a Composite Font is not flagged as cross-
platform, the client is then responsible for handling the encoding,
any recursion, and the font formats. This will allow the format to
serve the needs of more users and developers.
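As a rough illustration of how such a flag might trigger requirements, the Python sketch below parses a made-up XML description and checks the three conditions mentioned above. The element and attribute names (CompositeFont, Component, crossPlatform, format, encoding, composite) are purely hypothetical, since no schema has been agreed on; only the standard xml.etree.ElementTree module is real.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML; none of these element or attribute names are agreed on.
SAMPLE = """
<CompositeFont name="ExampleComposite" crossPlatform="true">
  <Component font="ExampleLatin" format="OpenType" encoding="Unicode"/>
  <Component font="ExampleCJK" format="OpenType" encoding="Unicode"/>
</CompositeFont>
"""


def check_cross_platform(xml_text):
    """Return a list of violations; empty if the description is acceptable."""
    root = ET.fromstring(xml_text.strip())
    if root.get("crossPlatform") != "true":
        # Not flagged: encoding, recursion, and formats are the client's job.
        return []
    problems = []
    for comp in root.iter("Component"):
        name = comp.get("font", "?")
        if comp.get("encoding") != "Unicode":
            problems.append(f"{name}: encoding must be Unicode")
        if comp.get("format") != "OpenType":
            problems.append(f"{name}: font format must be OpenType")
        if comp.get("composite") == "true":
            problems.append(f"{name}: a Composite Font cannot be a component")
    return problems


print(check_cross_platform(SAMPLE))
```

A description without the flag passes unconditionally in this sketch, which mirrors the idea that the client then takes on the responsibility.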
About defining the Composite Font metrics, isn't the 'BASE' table
designed to serve this purpose? In other words, the necessary
information, or at least a good chunk of it, should be encapsulated in
this OpenType table. Of course, some fonts lack this table. (Thinking
out loud, the presence of this table could also be considered one of
the requirements when a Composite Font is flagged as "cross-
platform.") The ability to adjust this on a per-component font basis,
along with scaling, needs to be in the format.
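To show the kind of per-component adjustment I have in mind, here is a small Python sketch that applies a scale factor and a baseline shift to each component's vertical metrics and then takes the largest extent across components. The field names, the font names, and all of the numbers are hypothetical; it does not read a 'BASE' table, and it simply assumes the ascender and descender are already known in font units.

```python
from dataclasses import dataclass


@dataclass
class ComponentMetrics:
    name: str                    # hypothetical font name
    units_per_em: int
    ascender: int                # font units, above the baseline
    descender: int               # font units, negative below the baseline
    scale: float = 1.0           # per-component scaling
    baseline_shift: float = 0.0  # in ems, positive moves glyphs up

    def scaled(self):
        """Return (top, bottom) in ems after scaling and shifting."""
        factor = self.scale / self.units_per_em
        return (self.ascender * factor + self.baseline_shift,
                self.descender * factor + self.baseline_shift)


def composite_extent(components):
    """Largest top and lowest bottom across all components, in ems."""
    tops, bottoms = zip(*(c.scaled() for c in components))
    return max(tops), min(bottoms)


# Illustrative values only.
latin = ComponentMetrics("ExampleLatin", 1000, 750, -250, scale=1.1)
cjk = ComponentMetrics("ExampleCJK", 1000, 880, -120)

print(composite_extent([latin, cjk]))
```

Here the top of the composite comes from the CJK component and the bottom from the scaled Latin component, so the overall extent is not that of any single font.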
Regards...
-- Ken
On 2009/03/13, at 5:00, karstenluecke wrote:
> I like that you reduce the Composite Font Format intention to the
> question, which issue is the format to address?
>
> As to 2.
> ["what are the defining metrics (e.g. max ascender, descender,
> leading) of the composite font and how closely do the components of
> a composite need to adhere to these metrics?"]
> I think there are two aspects:
> (a) Metrics that define ideal/recommended/automatic line-to-line
> distance.
> (a.1) Two columns of different-script texts do not necessarily need
> the same line-to-line distance. E.g. Latin--Arabic or Latin--Chinese/
> Japanese may even suffer from it. I am not sure if a composite font
> needs to impose "global" values here.
> (a.2) In case of text which includes single different-script words
> or phrases, the font that provides glyphs for the "primary" script
> text may determine the line-to-line distance, and the other script
> would follow. Here, "scale" factors as suggested in Mr Leonov's 4.
> may jump in.
> (b) Metrics that define maximum dimensions (OS/2.usWinAscent/
> Descent) should not have any impact on line-to-line distance anyway.
> If a composite font would provide these, they should be taken from
> the font with largest dimensions. There is no need to keep these
> values identical with every future composite font update or addition
> of other fonts to the composite font.
> But that would be an ideal world.
>
> Perhaps one more question which I cannot find addressed in the posts:
>
> 9.
> Do Unicode ranges (a) defined in a composite font refer to
> precomposed character-glyphs only or do they also (b) include
> characters not covered in the font/cmap as such but would result
> from Unicode composition rules + separate base/mark glyphs + ccmp/
> mark/mkmk?
> (b) would require that composite-font-savvy layout engines must,
> rather than may, support layout tables.
>
> Best wishes,
> Karsten Luecke