[MPEG-OTSPEC] comments wrt wide glyph ID proposal

Behdad Esfahbod behdad at behdad.org
Tue Dec 12 13:39:21 CET 2023


Thanks Peter for the excellent feedback.

Comments inline.

On Mon, Dec 11, 2023 at 7:54 PM Peter Constable <pconstable at microsoft.com>
wrote:

>
> *Hybrid narrow/wide fonts:*
>
> Hybrid fonts are going to be more challenging to build and maintain—much
> more so than hybrid COLRv0/v1. Attempting to engineer mechanisms
> specifically to accommodate hybrid fonts is likely to add to complexity.
>

I agree it should not be the focus of the work.


> *TTCs:*
>
> A second take-away for us from thinking about hybrid fonts is that we
> think TTCs can provide another approach to creating hybrid fonts—one that
> could be easier for font developers to create and maintain. To that end, we
> think it would make sense to define a v2.1 TTC header that adds numFonts2
> and tableDirectoryOffsets2 members, and provide guidance that software that
> supports wide glyph IDs should use only these new members, ignoring
> numFonts and tableDirectoryOffsets. In this way, older software could see
> only fonts with narrow glyph IDs, while newer software could see a distinct
> set of fonts without duplication.
>

I'd be happy to incorporate this.
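
To make the TTC idea concrete, here is a rough sketch of how a reader might
choose between the two font lists. The placement of numFonts2 and
tableDirectoryOffsets2 directly after the three v2.0 DSIG fields is purely my
assumption for illustration; nothing about a v2.1 layout has been specified,
and the function name is made up.

```python
import struct


def read_ttc_offsets(data: bytes, wide_aware: bool) -> list[int]:
    """Return the table directory offsets a reader would use.

    ASSUMPTION: in a hypothetical v2.1 header, numFonts2 (uint32) and
    tableDirectoryOffsets2 (Offset32 array) follow the v2.0 DSIG fields.
    """
    tag, major, minor, num_fonts = struct.unpack_from(">4sHHI", data, 0)
    if tag != b"ttcf":
        raise ValueError("not a TTC file")
    pos = 12
    offsets = list(struct.unpack_from(f">{num_fonts}I", data, pos))
    pos += 4 * num_fonts
    if major == 2 and minor >= 1 and wide_aware:
        pos += 12  # skip dsigTag, dsigLength, dsigOffset
        (num_fonts2,) = struct.unpack_from(">I", data, pos)
        pos += 4
        offsets = list(struct.unpack_from(f">{num_fonts2}I", data, pos))
    return offsets


def build_sample_header() -> bytes:
    """Build a toy header: 2 narrow fonts, 3 wide fonts (made-up offsets)."""
    return (
        struct.pack(">4sHHI", b"ttcf", 2, 1, 2)
        + struct.pack(">2I", 100, 200)       # tableDirectoryOffsets
        + struct.pack(">3I", 0, 0, 0)        # DSIG fields (none)
        + struct.pack(">I", 3)               # numFonts2 (hypothetical)
        + struct.pack(">3I", 300, 400, 500)  # tableDirectoryOffsets2
    )
```

Older software would stop after the first list and never look at the second,
which is the point of the suggestion.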


> This brought to my mind that, six – ten years ago (I forget the exact
> timeframe), there was discussion between Adobe, Apple and MS about defining
> a _*dmap*_ (delta cmap) table for use in TTCs: It’s very common in TTCs
> that there are cmap differences, with the result that each font in the TTC
> must have its own cmap without any sharing of data. In CJK fonts, the cmap
> table is one of the largest tables (probably second only to glyf or CFF /
> CFF2). Moreover, in a CJK font, the majority of mappings in a cmap table
> could be the same, with only a small portion of mappings being different.
> (E.g., in MS Gothic vs MS PGothic, all the ideograph glyphs are the same;
> it’s just Latins and punctuation that differ.) A dmap table would allow
> fonts in a TTC to share a common base cmap table with small, font-specific
> dmap tables handling differences. In our discussions, we came up with
> formats that would work, except we hadn’t figured out how to handle format
> 14 cmap subtables.
>
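
For readers not familiar with the dmap idea, the lookup behaviour described
above can be sketched as an overlay on a shared base mapping. This is only an
illustration of the concept, not any of the binary formats that were
discussed; the code points are real but the glyph IDs and names are invented.

```python
from collections import ChainMap


def effective_cmap(base_cmap: dict[int, int], dmap: dict[int, int]) -> ChainMap:
    """Overlay a small per-font delta map on a shared base cmap.

    Lookups consult the delta first and fall back to the shared base.
    """
    return ChainMap(dmap, base_cmap)


# Shared by every font in the collection (hypothetical glyph IDs).
shared_base = {0x4E00: 1200, 0x4E8C: 1201, 0x0041: 34}

# One font overrides only the Latin letter with a proportional glyph.
proportional_delta = {0x0041: 9034}

proportional_font = effective_cmap(shared_base, proportional_delta)
fixed_width_font = effective_cmap(shared_base, {})
```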

This reminds me of another idea, from Monotype, that we discussed in, I
think, 2019: introducing a `cmap` subtable that maps individual characters
to sequences of glyphs. Pre-composed Unicode characters would then not need
their own glyphs. Back then we dropped the idea for backwards-compat
reasons. But maybe we can pick it up now?
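
As a rough illustration of what such a subtable would mean for mapping (the
data layout, glyph IDs and function name here are all hypothetical, and this
says nothing about the actual binary format):

```python
def map_text(text: str,
             single: dict[int, int],
             sequences: dict[int, tuple[int, ...]]) -> list[int]:
    """Map characters to glyph IDs, allowing one character to yield
    a sequence of glyphs. Unmapped characters get glyph 0 (.notdef)."""
    glyphs: list[int] = []
    for ch in text:
        cp = ord(ch)
        if cp in sequences:
            glyphs.extend(sequences[cp])
        else:
            glyphs.append(single.get(cp, 0))
    return glyphs


# Hypothetical glyph IDs: 72 for "e", 310 for the combining acute accent.
single_map = {0x0065: 72, 0x0301: 310}
# U+00E9 has no glyph of its own; it maps to the two component glyphs.
sequence_map = {0x00E9: (72, 310)}
```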



> *COLR, MATH:*
>
> We noted that the proposal doesn’t include any integration for COLR or
> MATH tables. There might be several things to consider in relation to the
> MATH table, and we have no concern with leaving that for future
> consideration.
>
>
>
> But COLR might not be too difficult. So, we think it’s worth discussing
> options:
>
>    1. Postpone for future consideration.
>    2. Create a new major version — i.e., a new table tag — to design a
>    table with wide glyph IDs (it wouldn’t need to support narrow IDs).
>    3. Create a minor version enhancement (COLR v2) that maintains
>    backward compatibility while adding wide support.
>
>
>
> The third option would need to add new offsets in the header for wide
> variants of base glyph and clip lists, with new BaseGlyphPaintRecord2 and
> ClipRecord2 formats. (There’d also need to be a new PaintGlyph format, but
> that will be true regardless.)
>
>
>
> We haven’t yet decided which option we prefer; we just want to get it into
> discussion.
>

My preference is to introduce PaintGlyph variants with wide glyph IDs
without bumping the format number for now, and to postpone ClipList and
other enhancements to a future v2 version. Note that there already exist
COLRv1 fonts that hit the 64k glyph limit because of all the components.
Those would become feasible with just a new PaintGlyph2 / PaintColrGlyph2 /
etc extension and do not need the full ClipBox etc widening.


>
>
> *Max profile:*
>
> The current proposal doesn’t make any change wrt ‘maxp’, other than to say
> numGlyphs isn’t used for wide-GID support. In a hybrid font, it’s unclear
> what font developers should do with all the other maxp members: if they’re
> set as appropriate for narrow GIDs, then the values may not work for wide
> GIDs and the app could run out of resources. On the other hand, if the
> values are set for wide GIDs, those can work for both narrow and wide, but
> for older software could lead to over-allocation of unused resources.
>
>
>
> Since we’re already considering glyf/loca and GLYF/LOCA that can exist
> side by side, it seems simple and clean to define a MAXP table that gets
> used only in conjunction with GLYF/LOCA. These tables are small, so the
> file size impact is negligible.
>

How real is the use of max profile data these days? My understanding is
that since the data cannot be trusted anyway, software doesn't rely on it.



> *GPOS/GSUB:*
>
> It appears the proposal doesn’t yet include wide versions for common table
> formats that will be required (e.g., coverage). These will, of course, be
> needed.
>

I'm surprised by that. But you are right, they seem to be missing from the
PDF document. @Liam Quin <liam at fromoldbooks.org>

The proposal is:

  https://github.com/harfbuzz/boring-expansion-spec/issues/30


> This may be an opportunity to deprecate certain formats from use in
> wide-GID fonts. E.g., GSUB type 5 and GPOS type 7 (contextual) were
> effectively obsoleted when the chaining contextual formats were added. If
> we agreed, then Contextual positioning / substitution subtable formats 4 –
> 6 wouldn’t need to be added.
>

I'm ambivalent here. Adding them is simple enough for me and keeps
consistency.


> Various formats are proposed using uint24 for subtable counts and Offset24
> for subtable offsets. This could turn into a real limitation. For example,
> consider single substitution format 4: if glyphCount were 5,592,406, then
> the size of the substituteGlyphIDs[] array would exceed 0xFFFFFF and
> Offset24 for coverageOffset would not work. We’re inclined to make offsets
> and any counts not limited by 24-bit GIDs to be 32-bit.
>

In those extreme cases the subtable can be split into multiple subtables,
like we currently do with 16-bit offsets. I don't think it's a realistic
limitation, but I'm happy to bump all Coverage and ClassDef offsets to 32-bit.
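
For reference, the arithmetic behind Peter's number can be checked with a few
lines. This assumes 3-byte (24-bit) glyph IDs in the substituteGlyphIDs[]
array and counts only the array itself, ignoring the subtable's header
fields, so the real threshold would be slightly lower. The names are mine.

```python
OFFSET24_MAX = 0xFFFFFF   # largest value an Offset24 can hold
GLYPH_ID24_SIZE = 3       # bytes per 24-bit glyph ID (assumed)


def substitute_array_size(glyph_count: int) -> int:
    """Size in bytes of an array of glyph_count 24-bit glyph IDs."""
    return glyph_count * GLYPH_ID24_SIZE


def fits_in_offset24(glyph_count: int) -> bool:
    """True if an offset past the array is still representable."""
    return substitute_array_size(glyph_count) <= OFFSET24_MAX


# Largest glyph count whose array still fits within 0xFFFFFF bytes.
largest_ok = OFFSET24_MAX // GLYPH_ID24_SIZE
```

So 5,592,405 entries land exactly on 0xFFFFFF bytes and one more entry goes
over, which matches the figure quoted above.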

Thanks,

behdad