[MPEG-OTSPEC] Shared GSUB/GPOS notes, was Re: dmap proposal

Skef Iterum skef at skef.org
Thu Dec 28 06:42:58 CET 2023


This all makes sense, but is what I was getting at in my earlier message 
when I said (as one horn of alternatives) "there's some token 
("dialect"?) that Unicode should be tracking and formalizing but isn't". 
If what we need to track is specific enough to point a user at the right 
font, it should be specific enough to be assigned a token for use as a 
langsys, or some successor of a langsys. It seems better to me to try to 
get that worked out and up to date than to just let the current system 
rot relative to actual usage.

Is the current system so inflexible (in terms of "registry" or whatever) 
that it's not possible to get some new tags allocated to match the 
regions we would be building ttc-type fonts for?

As far as multiple options go, that sounds fine to me as long as a good 
faith and ongoing effort is being made to make the different options 
viable. Whereas it sounds a little like dmap is a bit of a "here's a 
hack so we can just not worry about that other stuff" sort of thing.

Skef

On 12/27/23 17:49, John Hudson wrote:
> On 2023-12-27 1:56 pm, Skef Iterum wrote:
>>
>> If I understand you right, things have gone against the 
>> script/language mechanism over the past decades on the (broadly 
>> speaking) client side. So the responsible thing to do now would be to 
>> deprecate that mechanism in the spec and recommend that future fonts 
>> do all substitutions and positioning in the context of DFLT dflt. 
>> This will save foundries a lot of effort and heartache.
>>
> The /script/ system in OTL is mostly fine, since its implementation is 
> mostly derived from Unicode script properties. The only shaky part of 
> that infrastructure is the lack of a standardised algorithm for script 
> itemisation and glyph run segmentation, which can lead to inconsistent 
> results for script=Common characters in different shaping engines.
>
> I always found the DFLT script concept confusing and uninviting—except 
> possibly for PUA—, and I don’t agree that it would ‘save foundries a 
> lot of effort and heartache’; rather, it would push font makers into 
> the AAT-like realm of trying to implement all shaping behaviour—even 
> standard behaviour derivable from character properties, such as Indic 
> reordering—within GSUB and GPOS. Again: the /script/ shaping aspect of 
> OTL is mostly pretty reliable and robust: it could just do with a bit 
> better standardisation of upfront itemisation and segmentation.
>
> It is the /langsys/ aspect that has proven to be unreliable and 
> fragile, and while Simon is partly right when he says that this is a 
> vendor implementation failure rather than a font format failure, I 
> think he is also partly wrong, because there are conceptual problems 
> in langsys that contribute to those implementation failures along 
> with, of course, /the absence of an implementation specification./ As 
> originally conceived by Eliyezer, a registered langsys tag represented 
> something like a ‘set of typographic conventions that might be shared 
> by multiple fonts and that /might/ be associated with a particular 
> language’.
>
> [One of my favourite examples of the distinction between langsys and 
> language was provided by Paul Nelson in the early days of registering 
> langsys tags: he pointed to differing conventions employed by French 
> and German classicists in their typography of Greek texts, and noted 
> that these could be captured in the script/langsys pairings grek/FRA 
> and grek/DEU.]
>
> That we are now talking about cmap vs GSUB in the context of ‘the 
> language/region problem’ illustrates the conceptual problem of langsys 
> in OpenType. Neither language nor region are reliably and 
> unambiguously captured in langsys, and hence mapping of langsys layout 
> behaviours in GSUB and GPOS to specific languages or regions are 
> more-or-less guessed at, or failed to be guessed at, in those vendor 
> applications to which Simon referred. So, for example, Adobe chose to 
> make OTL langsys GSUB and GPOS accessible via spellchecking and 
> dictionary language settings, which is the sort of thing that appears 
> to work for a lot of languages, but does so by simply ignoring the 
> ways in which langsys was designed to be able to represent sets of 
> typographic conventions beyond language-specific forms or behaviours. 
> This means that there are registered langsys tags that are never going 
> to be accessible within Adobe’s implementation model, e.g. IPPH.
>
> Even if the implementation of langsys is limited in this way, to 
> hard-coded lists of langsys-to-language mappings, reliable application 
> of the langsys GSUB and GPOS relies on users or user agents setting 
> text language tags in documents, which is not something I have found 
> can be relied upon. Software could assist in this regard by 
> automatically identifying text language and applying appropriate 
> language tags, so perhaps failure to do so is the sort of thing Simon 
> has in mind. But there remain edge-cases, e.g. where text is too short 
> to be reliably identified, or where a user wants to invoke a 
> particular langsys behaviour—perhaps because it is /regionally/ 
> appropriate—for a language other than the one with which it is 
> associated by the software.
>
> From the preamble to the OTL langsys registry:
>
>     /What is meant by a “language system” in this context is a set of
>     typographic conventions for how text in a given script should be
>     presented. Such conventions may be associated with particular
>     languages, with particular genres of usage, with different
>     publications, and other such factors. For example, particular
>     glyph variants for certain characters may be required for
>     particular languages, or for phonetic transcription or
>     mathematical notation./
>
> Given the multivalency inherent in that definition of what is meant by 
> language system, it is difficult to see exactly /how/ software vendors 
> are meant to ‘correctly’ implement support. Personally, I think a 
> proper implementation is one that provides the user with a mechanism 
> to explicitly apply a particular OTL langsys to text, independent of 
> all other language or region tagging, i.e. to be able to invoke 
> particular GSUB and GPOS behaviour as grouped within a given font 
> under langsys tags in a way that overrides any algorithmic application 
> of the tags.
>
>> In contrast, a hinge point in GSUB/GPOS means that one can design a 
>> single unified font and just tie into the "initial" script/language 
>> using the overlapping GSUB trick (which could presumably be canned in 
>> a tool-set like fontTools) and TTC, addressing the messy present 
>> while not giving up on the better future. 
>
> There is a third option, of course, which is to provide both 
> mechanisms and let the font makers decide which to employ or, even, to 
> invent ways to combine them. In the same way that we can currently 
> make TTCs with separate cmap tables or with separate GSUB tables, or 
> with both, why not make it possible for us to use data-optimised dmap 
> or overlapping GSUB or both?
>
> JH
>
>
> PS. I rather like the idea of region langsys tags or language group 
> langsys tags, which would provide more efficient mechanisms in fonts 
> to address conventions across multiple languages, and to make 
> distinctions between e.g. Eastern and Western styles of Devanagari in 
> a single Sanskrit font.
>
>
> -- 
>
> John Hudson
> Tiro Typeworks Ltd www.tiro.com
>
> Tiro Typeworks is physically located on islands
> in the Salish Sea, on the traditional territory
> of the Snuneymuxw and Penelakut First Nations.
>
> __________
>
> EMAIL HOUR
> In the interests of productivity, I am only dealing
> with email towards the end of the day, typically
> between 4PM and 5PM. If you need to contact me more
> urgently, please use other means.
>
> _______________________________________________
> mpeg-otspec mailing list
> mpeg-otspec at lists.aau.at
> https://lists.aau.at/mailman/listinfo/mpeg-otspec