[MPEG-OTSPEC] [EXTERNAL] Re: Cmap format to map 1 char to multiple glyphs?

Skef Iterum skef at skef.org
Thu Dec 21 22:27:43 CET 2023


This is just an idea (and perhaps an ill-informed one if I'm not 
understanding the full problem correctly):

It seems to me that if these cmap proposals were to be adopted it would 
almost certainly be at the same time as the >64k GID extensions. The 
problem then isn't running out of GIDs; there are plenty. Instead, it's 
removing the overhead associated with those GIDs in various tables.

We've already discussed that, going forward, there will be some way of 
determining the equivalent of the maxp GID count, either explicitly or 
based on the length of a certain table. So suppose that we say that any 
GID higher than that (or, if that's too informal, any GID higher than the 
value in some additional font-wide table field, or between the values in 
two such fields) has the following requirements:

  * It can only appear in cmap and GSUB
  * Under any valid combination of active GSUB feature tags, the GID
    must be substituted by other GIDs below the usual limit.

This way:

 1. No additional cmap or GSUB "logic" is necessary for interpreting fonts
 2. The system already accommodates any relevant form of substitution
    available in GSUB, now or in the future
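
To make the idea concrete, here is a rough toy sketch. It is not a real 
font API: the names GLYPH_LIMIT, cmap, multiple_subst and shape are all 
made up for illustration, and GSUB is reduced to a single always-active 
multiple-substitution lookup. U+00C1 maps to an out-of-range GID, and the 
ordinary substitution step turns it into two in-range GIDs:

```python
# Toy model only; every name here is hypothetical, not a real library API.
GLYPH_LIMIT = 4  # assumed: GIDs 0..3 have outlines, metrics, etc.

# cmap: code point -> GID. U+00C1 gets a GID at or above the limit,
# so it costs nothing in glyf, hmtx and similar per-glyph tables.
cmap = {0x0041: 1, 0x0301: 2, 0x00C1: 1000}

# One GSUB multiple-substitution lookup, assumed to be always active.
multiple_subst = {1000: [1, 2]}  # out-of-range GID -> A + combining acute


def shape(text):
    """Map characters through cmap, then apply the one GSUB lookup."""
    gids = [cmap.get(ord(ch), 0) for ch in text]  # 0 = .notdef
    out = []
    for gid in gids:
        out.extend(multiple_subst.get(gid, [gid]))
    return out


glyphs = shape("\u00c1")
# Every GID that survives substitution is below the limit.
assert all(gid < GLYPH_LIMIT for gid in glyphs)
```

The point of the sketch is that shape() contains nothing new: the cmap 
lookup and the substitution step are the ones a shaper already has.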

The main potential drawback I can see is that validating the GSUB 
requirement could be tricky, but I'm not sure that iron-clad validation 
is necessarily a requirement. There are lots of ways that fonts can have 
bugs that one can't necessarily rule out in advance.
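
As a rough illustration of why validation could be tricky, here is a 
brute-force sketch under heavy simplifying assumptions: GSUB is modelled 
as an ordered list of (tag, mapping) pairs, only context-free single and 
multiple substitutions exist, and a tag of None means "always on". The 
function names are hypothetical. Even in this toy form the check walks 
every subset of optional features, which grows exponentially, and real 
contextual lookups would make it harder still.

```python
from itertools import combinations


def apply_lookups(gids, lookups):
    """Apply each context-free substitution mapping in order."""
    for lookup in lookups:
        out = []
        for gid in gids:
            out.extend(lookup.get(gid, [gid]))
        gids = out
    return gids


def failing_cases(cmap_gids, lookups, optional_tags, limit):
    """Return (gid, active_tags) pairs where an out-of-range GID survives.

    lookups: ordered list of (tag, mapping); tag None means always active.
    GIDs >= limit are treated as out of range (an assumption of this sketch).
    """
    failures = []
    virtual = [g for g in cmap_gids if g >= limit]
    for r in range(len(optional_tags) + 1):
        for active in combinations(optional_tags, r):
            chosen = [m for tag, m in lookups if tag is None or tag in active]
            for gid in virtual:
                result = apply_lookups([gid], chosen)
                if any(g >= limit for g in result):
                    failures.append((gid, active))
    return failures


# GID 1001 is only substituted when the optional feature "ss01" is on,
# so it fails the requirement when no optional feature is active.
example_lookups = [(None, {1000: [1, 2]}), ("ss01", {1001: [3]})]
problems = failing_cases([1, 1000, 1001], example_lookups, ["ss01"], 4)
```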

Skef

On 12/21/23 10:00, Peter Constable wrote:
>
> It seems to me there’d at least be a compatibility boundary: newer 
> fonts with 1:m cmap mappings wouldn’t produce desired results in older 
> software unless the same effect were also implemented in GSUB lookups.
>
> Peter
>
> *From:* mpeg-otspec <mpeg-otspec-bounces at lists.aau.at> *On Behalf Of* 
> Behdad Esfahbod
> *Sent:* Tuesday, December 12, 2023 8:40 PM
> *To:* Ned Holbrook <ned at apple.com>
> *Cc:* mpeg-otspec at lists.aau.at
> *Subject:* [EXTERNAL] Re: [MPEG-OTSPEC] Cmap format to map 1 char to 
> multiple glyphs?
>
> Fair.  I'll do some measurements and report back if I find something 
> interesting.
>
>
> behdad
> http://behdad.org/
>
> On Tue, Dec 12, 2023 at 4:01 PM Ned Holbrook <ned at apple.com> wrote:
>
>     My main concern with producing multiple glyphs is that it has
>     substantial API and tooling implications.
>
>
>
>         On Dec 12, 2023, at 10:57 AM, Behdad Esfahbod
>         <behdad at behdad.org> wrote:
>
>         On Tue, Dec 12, 2023 at 11:51 AM John Hudson <john at tiro.ca> wrote:
>
>             I proposed that to the OT developer list a long while ago,
>             and recall that Kamal had a similar idea, initially in
>             terms of handling Unicode decompositions such that fonts
>             would not need precomposed diacritics. At the time,
>             Microsoft thought it unlikely to get traction, as it
>             implied significant engineering for unclear benefit, but
>             perhaps the benefit is clearer now? As you say, being able
>             to decompose a Unicode character to an arbitrary sequence
>             of glyphs is very useful for Arabic, and by-passes the
>             need to handle such decompositions in GSUB prior to other
>             shaping.
>
>             I suppose the question is whether there is a significant
>             benefit to doing this outside of GSUB? — or, indeed, if
>             there might be a reason it would be preferable in GSUB?
>
>             The inconsistency in dot handling in different joining
>             forms of some Arabic characters means that one doesn’t
>             always want to up-front decompose some characters to base
>             grapheme and combining dots, but those could be excluded
>             from the cmap and passed to GSUB for decomposition in the
>             joining form features. But that being the case, why not do
>             it all in GSUB?
>
>         Thanks John. The main benefit in my opinion is not allocating
>         a gid to every precomposed Unicode character, most of them
>         Latin. The Arabic use-case is extra.
>
>         b
>
>             JH
>
>             On 2023-12-12 9:04 am, Behdad Esfahbod wrote:
>
>                 Thank you everyone for the very productive meeting.
>
>                 I'd also like to bring this issue up. If there is
>                 interest, I can work on it. I wrote in my reply to
>                 Peter earlier:
>
>                 /This reminds me of another idea we discussed in, I
>                 think, 2019, from Monotype to introduce a `cmap`
>                 subtable that would map individual characters to
>                 sequences of glyphs. Then the pre-composed Unicode
>                 characters wouldn't need to have their own glyphs.
>                 Back then we dropped the idea for backwards-compat
>                 reasons. But maybe we can pick it up now?/
>
>                 This is very useful for Arabic as well...
>
>                 behdad
>                 http://behdad.org/
>
>                 _______________________________________________
>
>                 mpeg-otspec mailing list
>
>                 mpeg-otspec at lists.aau.at
>
>                 https://lists.aau.at/mailman/listinfo/mpeg-otspec
>
>             -- 
>
>             John Hudson
>
>             Tiro Typeworks Ltd  www.tiro.com  <http://www.tiro.com/>
>
>             Tiro Typeworks is physically located on islands
>
>             in the Salish Sea, on the traditional territory
>
>             of the Snuneymuxw and Penelakut First Nations.
>
>             __________
>
>             EMAIL HOUR
>
>             In the interests of productivity, I am only dealing
>
>             with email towards the end of the day, typically
>
>             between 4PM and 5PM. If you need to contact me more
>
>             urgently, please use other means.
>
>
>
>