[mpeg-OTspec] New cmap format

Mon Apr 9 04:47:55 CEST 2012

Martin Hosken wrote:

> I agree, shaping engines should do the right thing here and
> decompose where a composed cmap entry is missing. In fact they
> should also compose decomposed sequences where a composed cmap entry
> exists. So the issue cuts both ways. Another reason why just having
> a decomposing cmap isn't necessarily going to solve the problem for
> everyone.

Just a word of caution here: this is a somewhat tricky problem to
define at the Unicode level. If you take "decompose" to mean Unicode
canonical decomposition (i.e. NFD, Normalization Form D), you will end
up mapping single-character CJK compatibility characters to their
unified codepoints.  Applications such as browsers need to support
legacy encoding schemes such as JIS.  For them, this would effectively
mean that when a font lacks a compatibility codepoint but supports the
"decomposed" codepoint, the glyph for the unified codepoint is used
instead of falling back to a font that contains a glyph for the
compatibility codepoint.  The correct rendering of names in Japanese
*requires* valid rendering of these compatibility characters (i.e. not
replacing them with a glyph for the unified codepoint).
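
To make the concern concrete, here is a rough sketch in Python using
the standard unicodedata module.  The helper names, the toy "cmap"
(just a set of characters) and the allow_singletons switch are all
hypothetical, made up for illustration; this is not how any particular
shaping engine works, only one way the difference could be expressed.

```python
import unicodedata


def nfd(ch):
    """Canonical decomposition (NFD) of a single character."""
    return unicodedata.normalize("NFD", ch)


def is_singleton(ch):
    """True when NFD maps ch to exactly one *different* character."""
    d = nfd(ch)
    return d != ch and len(d) == 1


def glyph_source(ch, cmap, allow_singletons):
    """Hypothetical lookup policy against a toy cmap (a set of characters).

    Returns "direct", "decomposed", or "fallback" (try another font).
    """
    if ch in cmap:
        return "direct"
    d = nfd(ch)
    if d != ch and all(c in cmap for c in d):
        # Assumption: singleton decompositions are only followed on request.
        if allow_singletons or not is_singleton(ch):
            return "decomposed"
    return "fallback"


# Toy font: has "e", COMBINING ACUTE ACCENT and the unified ideograph
# U+585A, but neither U+00E9 nor the compatibility ideograph U+FA10.
font = {"e", "\u0301", "\u585a"}

print(glyph_source("\u00e9", font, allow_singletons=False))
print(glyph_source("\ufa10", font, allow_singletons=True))
print(glyph_source("\ufa10", font, allow_singletons=False))
```

With plain NFD (allow_singletons=True) the compatibility ideograph
U+FA10 is silently served by the glyph for U+585A, which is the case I
am worried about.  Skipping singleton decompositions is only a crude
heuristic; it would also leave alone characters such as U+2126 OHM
SIGN, which may or may not be what one wants.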

It's unfortunate that this issue complicates the goal of being able
to support more space-efficient fonts, ones for which precomposed forms
can be omitted.

Cheers,

John Daggett
Mozilla Japan