[MPEG-OTSPEC] Cmap format to map 1 char to multiple glyphs?

Ned Holbrook ned at apple.com
Wed Dec 13 00:01:15 CET 2023


My main concern with producing multiple glyphs is that it has substantial API and tooling implications.
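
A minimal sketch of the kind of API change this implies, assuming a hypothetical sequence-valued cmap lookup. The function names, table contents, and glyph IDs below are made up for illustration and do not come from any existing font library or from the proposal itself:

```python
from typing import Dict, List

# Hypothetical data: the glyph IDs are invented for illustration only.
SINGLE_CMAP: Dict[int, int] = {
    0x0065: 72,   # U+0065 LATIN SMALL LETTER E
    0x0301: 310,  # U+0301 COMBINING ACUTE ACCENT
}
# Hypothetical sequence-valued subtable: U+00E9 has no glyph of its own.
SEQUENCE_CMAP: Dict[int, List[int]] = {
    0x00E9: [72, 310],  # e-acute -> e + combining acute
}


def get_glyph(codepoint: int) -> int:
    """Classic lookup: one character in, exactly one glyph ID out (0 = .notdef)."""
    return SINGLE_CMAP.get(codepoint, 0)


def get_glyphs(codepoint: int) -> List[int]:
    """Hypothetical sequence-aware lookup: one character in, one or more glyph IDs out."""
    if codepoint in SEQUENCE_CMAP:
        return list(SEQUENCE_CMAP[codepoint])
    return [get_glyph(codepoint)]
```

Any caller written against the first signature, for example code that allocates one glyph slot per character before shaping, would presumably need to be revisited to handle the second.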

> On Dec 12, 2023, at 10:57 AM, Behdad Esfahbod <behdad at behdad.org> wrote:
> 
> On Tue, Dec 12, 2023 at 11:51 AM John Hudson <john at tiro.ca> wrote:
>> I proposed that to the OT developer list a long while ago, and recall that Kamal had a similar idea, initially in terms of handling Unicode decompositions such that fonts would not need precomposed diacritics. At the time, Microsoft thought it unlikely to get traction, as it implied significant engineering for unclear benefit, but perhaps the benefit is clearer now? As you say, being able to decompose a Unicode character to an arbitrary sequence of glyphs is very useful for Arabic, and bypasses the need to handle such decompositions in GSUB prior to other shaping.
>> 
>> I suppose the question is whether there is a significant benefit to doing this outside of GSUB, or, indeed, whether there might be a reason it would be preferable in GSUB.
>> 
>> The inconsistency in dot handling in different joining forms of some Arabic characters means that one doesn’t always want to decompose some characters up front into base grapheme and combining dots, but those could be excluded from the cmap and passed to GSUB for decomposition in the joining form features. But that being the case, why not do it all in GSUB?
>> 
> Thanks John. The main benefit in my opinion is not allocating a gid to every precomposed Unicode character, most of them Latin. The Arabic use-case is extra.
>  
> b
> 
> 
>> JH
>> 
>> 
>> 
>> On 2023-12-12 9:04 am, Behdad Esfahbod wrote:
>>> Thank you everyone for the very productive meeting.
>>> 
>>> I would also like to bring this issue up. If there is interest, I can work on it. I wrote in my reply to Peter earlier:
>>> 
>>> This reminds me of another idea, from Monotype, that we discussed in, I think, 2019: to introduce a `cmap` subtable that would map individual characters to sequences of glyphs. Then the precomposed Unicode characters wouldn't need to have their own glyphs. Back then we dropped the idea for backwards-compatibility reasons. But maybe we can pick it up now?
>>> 
>>> This is very useful for Arabic as well...
>>> 
>>> behdad
>>> http://behdad.org/
>>> 
>>> 
>>> _______________________________________________
>>> mpeg-otspec mailing list
>>> mpeg-otspec at lists.aau.at
>>> https://lists.aau.at/mailman/listinfo/mpeg-otspec
>> -- 
>> 
>> John Hudson
>> Tiro Typeworks Ltd    www.tiro.com
>> 
>> Tiro Typeworks is physically located on islands 
>> in the Salish Sea, on the traditional territory 
>> of the Snuneymuxw and Penelakut First Nations.
>> 
