[MPEG-OTSPEC] [EXTERNAL] Re: Cmap format to map 1 char to multiple glyphs?
Skef Iterum
skef at skef.org
Thu Dec 21 22:27:43 CET 2023
This is just an idea (and perhaps an ill-informed one if I'm not
understanding the full problem correctly):
It seems to me that if these cmap proposals were to be adopted, it would
almost certainly be at the same time as the >64k GID extensions. The
problem isn't running out of GIDs; there are plenty. Instead, it's
removing the overhead associated with those GIDs in various tables.
We've already discussed that there will be some way of determining the
equivalent of the maxp GID count going forward, either explicit or based
on the length of a certain table. So suppose that we say that any GID
higher than that (or, if that's too informal, GIDs higher than the value
in some additional font-wide table field, or between the values in two
such fields) has the following requirements:
* It can only appear in cmap and GSUB
* Under any valid combination of active GSUB feature tags, the GID
must be substituted by other GIDs below the usual limit.
This way:
1. No additional cmap or GSUB "logic" is necessary for interpreting fonts
2. The system already accommodates any relevant form of substitution
available in GSUB, now or in the future
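To make this concrete, here is a toy sketch in Python. It is not real
font-engine code: REAL_GLYPH_LIMIT, CMAP, DECOMPOSE and the function
names are all hypothetical, and I'm assuming "higher than the limit"
means a GID at or above the maxp-equivalent count. The cmap stays
strictly 1:1, and an ordinary GSUB-style multiple substitution does the
1:m expansion.

```python
# Toy model only; none of these names come from a real font library.
REAL_GLYPH_LIMIT = 4  # assumed stand-in for the maxp-equivalent glyph count

# cmap stays 1:1: each character maps to exactly one GID,
# which may be a "virtual" GID at or above the limit.
CMAP = {
    ord("e"): 1,
    0x0301: 2,           # combining acute accent
    ord("\u00e9"): 4,    # precomposed e-acute -> virtual GID 4
}

# A GSUB-style multiple substitution: one GID -> sequence of GIDs.
DECOMPOSE = {
    4: [1, 2],           # virtual e-acute -> e + combining acute
}


def map_chars(text):
    """Apply the 1:1 cmap to every character."""
    return [CMAP[ord(ch)] for ch in text]


def apply_multiple_subst(gids, table):
    """Replace each GID found in `table` by its sequence; keep the rest."""
    out = []
    for gid in gids:
        out.extend(table.get(gid, [gid]))
    return out


def shape(text):
    """cmap lookup followed by substitution; virtual GIDs must not survive."""
    gids = apply_multiple_subst(map_chars(text), DECOMPOSE)
    leftover = [g for g in gids if g >= REAL_GLYPH_LIMIT]
    if leftover:
        raise ValueError(f"virtual GIDs survived GSUB: {leftover}")
    return gids
```

In this model shape("\u00e9") and shape("e\u0301") both give [1, 2], so
the precomposed character costs a GID number but no real glyph.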
The main potential drawback I can see is that validating the GSUB
requirement could be tricky, but I'm not sure that iron-clad validation
is necessarily a requirement. There are lots of ways that fonts can have
bugs that one can't necessarily rule out in advance.
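As a rough illustration of what such a check might look like, here is a
brute-force sketch under heavy simplifying assumptions: each feature is
modelled as a context-free GID -> GID-sequence table, features apply in
the order they are listed, and every subset of the optional features
counts as a "valid combination". Real GSUB has contextual lookups and
lookup ordering, which is where it gets tricky. find_violations and its
parameters are made-up names.

```python
from itertools import combinations


def find_violations(cmap_gids, features, limit, required=()):
    """Return (active_tags, gid) pairs where a virtual GID survives.

    cmap_gids: GIDs reachable from cmap.
    features:  dict of tag -> {gid: [replacement gids]} (context-free model).
    limit:     GIDs >= limit are treated as virtual.
    required:  tags assumed to be always active.
    """
    optional = [tag for tag in features if tag not in required]
    violations = []
    for r in range(len(optional) + 1):
        for combo in combinations(optional, r):
            # Keep the order in which the features were listed.
            active = [t for t in features if t in required or t in combo]
            for start in cmap_gids:
                if start < limit:
                    continue  # real glyphs are not constrained
                gids = [start]
                for tag in active:
                    table = features[tag]
                    gids = [g for gid in gids for g in table.get(gid, [gid])]
                violations.extend(
                    (tuple(active), g) for g in gids if g >= limit
                )
    return violations
```

For example, if the decomposition lives only in an optional feature, the
check reports the combination where that feature is off; if it lives in
an always-on feature, it reports nothing.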
Skef
On 12/21/23 10:00, Peter Constable wrote:
>
> It seems to me there’d at least be a compatibility boundary: newer
> fonts with 1:m cmap mappings wouldn’t produce desired results in older
> software unless the same effect were also implemented in GSUB lookups.
>
> Peter
>
> *From:* mpeg-otspec <mpeg-otspec-bounces at lists.aau.at> *On Behalf Of*
> Behdad Esfahbod
> *Sent:* Tuesday, December 12, 2023 8:40 PM
> *To:* Ned Holbrook <ned at apple.com>
> *Cc:* mpeg-otspec at lists.aau.at
> *Subject:* [EXTERNAL] Re: [MPEG-OTSPEC] Cmap format to map 1 char to
> multiple glyphs?
>
> Fair. I'll do some measurements and report back if I find something
> interesting.
>
>
> behdad
> http://behdad.org/
>
> On Tue, Dec 12, 2023 at 4:01 PM Ned Holbrook <ned at apple.com> wrote:
>
> My main concern with producing multiple glyphs is that it has
> substantial API and tooling implications.
>
>
>
> On Dec 12, 2023, at 10:57 AM, Behdad Esfahbod
> <behdad at behdad.org> wrote:
>
> On Tue, Dec 12, 2023 at 11:51 AM John Hudson <john at tiro.ca> wrote:
>
> I proposed that to the OT developer list a long while ago,
> and recall that Kamal had a similar idea, initially in
> terms of handling Unicode decompositions such that fonts
> would not need precomposed diacritics. At the time,
> Microsoft thought it unlikely to get traction, as it
> implied significant engineering for unclear benefit, but
> perhaps the benefit is clearer now? As you say, being able
> to decompose a Unicode character to an arbitrary sequence
> of glyphs is very useful for Arabic, and by-passes the
> need to handle such decompositions in GSUB prior to other
> shaping.
>
> I suppose the question is whether there is a significant
> benefit to doing this outside of GSUB? — or, indeed, if
> there might be a reason it would be preferable in GSUB?
>
> The inconsistency in dot handling in different joining
> forms of some Arabic characters means that one doesn’t
> always want to up-front decompose some characters to base
> grapheme and combining dots, but those could be excluded
> from the cmap and passed to GSUB for decomposition in the
> joining form features. But that being the case, why not do
> it all in GSUB?
>
> Thanks John. The main benefit in my opinion is not allocating
> a gid to every precomposed Unicode character, most of them
> Latin. The Arabic use-case is extra.
>
> b
>
> JH
>
> On 2023-12-12 9:04 am, Behdad Esfahbod wrote:
>
> Thank you everyone for the very productive meeting.
>
> I'd also like to bring this issue up. If there is
> interest, I can work on it. I wrote in my reply to
> Peter earlier:
>
> /This reminds me of another idea we discussed in, I
> think, 2019, from Monotype to introduce a `cmap`
> subtable that would map individual characters to
> sequences of glyphs. Then the pre-composed Unicode
> characters wouldn't need to have their own glyphs.
> Back then we dropped the idea for backwards-compat
> reasons. But maybe we can pick it up now?/
>
> This is very useful for Arabic as well...
>
> behdad
> http://behdad.org/
>
> _______________________________________________
>
> mpeg-otspec mailing list
>
> mpeg-otspec at lists.aau.at
>
> https://lists.aau.at/mailman/listinfo/mpeg-otspec
>
> --
>
> John Hudson
>
> Tiro Typeworks Ltd www.tiro.com <http://www.tiro.com/>
>
> Tiro Typeworks is physically located on islands
>
> in the Salish Sea, on the traditional territory
>
> of the Snuneymuxw and Penelakut First Nations.
>
> __________
>
> EMAIL HOUR
>
> In the interests of productivity, I am only dealing
>
> with email towards the end of the day, typically
>
> between 4PM and 5PM. If you need to contact me more
>
> urgently, please use other means.
>