[OpenType] RE: [mpeg-OTspec] Re: Languages tags (Buruchaski & North Slavey)

Peter Constable petercon at microsoft.com
Tue May 24 22:13:52 CEST 2016

One ISO 639 ID mapping to multiple OT tags is certainly conceivable, and there are some OT tags going back a long while that fall in this category:

MAL, "Malayalam"
MLR, "Malayalam Reformed"

I'm pretty sure this was created with the intent of distinguishing "traditional" versus "reformed" Malayalam typography (the former using far more conjuncts than the latter). That's definitely an exceptional case, however. But to allow for that possibility, I've always referred to the _default_ mapping of language tags to OT language system tags.

But groupings of languages for purposes of typography do make more sense. The only problem is that it would be a non-trivial research effort to identify how 7000 languages (or however many of them have any literary tradition) should be organized into typographic groupings. It's been a lot easier for people to say, "I'm creating a font targeting language X, so I'll use a language system tag for X."

If we did have data for groupings and also for specific languages, then we'd need to document which languages are encompassed by each grouping. We'd want to assert that any specific language can belong to at most one grouping. Then the software implementation that would make sense would be to test the selected font for the specific OT tag, else test for the grouping tag, else use the default language system.
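
A minimal sketch of that lookup order, assuming Python and using the Slavey tags from this thread as sample data. The function names, the plain string tags (without the space padding real OT tags carry), and the "dflt" return value standing in for the default language system are all illustrative assumptions, not anything defined by the spec.

```python
def check_single_grouping(groupings):
    """Return ISO code -> grouping tag; raise if a code is in two groupings."""
    owner = {}
    for group_tag, iso_codes in groupings.items():
        for code in iso_codes:
            if code in owner:
                raise ValueError(
                    f"{code} is in both {owner[code]} and {group_tag}"
                )
            owner[code] = group_tag
    return owner


def select_lang_sys(font_tags, iso_code, specific, group_of):
    """Test the font for the specific tag, then the grouping tag, else default."""
    for candidate in (specific.get(iso_code), group_of.get(iso_code)):
        if candidate is not None and candidate in font_tags:
            return candidate
    return "dflt"  # stand-in for the font's default language system


# Sample data from the Slavey discussion (hypothetical as registry content).
SPECIFIC = {"scs": "SCS", "xsl": "SSL"}
GROUPINGS = {"SLA": {"scs", "xsl"}}
GROUP_OF = check_single_grouping(GROUPINGS)
```

With this data, a font that only carries SLA would be matched through the grouping for either Slavey language, while a font that also carries SCS would get the specific tag for North Slavey text.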


-----Original Message-----
From: John Hudson [mailto:john at tiro.ca] 
Sent: Tuesday, May 24, 2016 12:29 PM
To: Peter Constable <petercon at microsoft.com>; Denis Jacquerye <moyogo at gmail.com>; Levantovsky, Vladimir <Vladimir.Levantovsky at monotype.com>; OTspec (mpeg-OTspec at yahoogroups.com) <mpeg-OTspec at yahoogroups.com>
Subject: Re: [OpenType] RE: [mpeg-OTspec] Re: Languages tags (Buruchaski & North Slavey)

On 24/05/16 11:56, Peter Constable wrote:

> With that in mind, SLA could be left as a higher-level category, and the language tag scs could be left as a mapping. But if we really want it to be a higher-level category, then the mapping should list both scs and xsl. Then we'd be left with this:
>
> Lang system tag: SLA
> Description: Slavey
> ISO 639-3 mappings: scs, xsl
>
> Lang system tag: SSL
> Description: South Slavey
> ISO 639-3 mappings: xsl
>
> That suggests a gap that could be filled, as John suggests:
>
> Lang system tag: SCS
> Description: North Slavey
> ISO 639-3 mappings: scs
>
> That would work and wouldn't conflict with anything. I wonder if it's overkill: is anybody going to implement fonts that have different glyphs or layout behaviour for SLA versus SSL versus SCS? But I won't object if that's how people want to proceed.

I think it is worthwhile to think about this as a more generalised case, rather than thinking about the Slavey languages per se. As Peter notes, OTL language system tags were badly under-documented; now they are less badly under-documented, but they're not even close to well-documented. 
As with other aspects of OTL, there is no standard implementation specification, or even recommendation of best practices.

Something like the structure Peter outlines, above, for Slavey languages, looks to me useful in a number of other situations, if it were spec'd what client software is supposed to do.

My take on this is that it would be useful to be able to specify language *group* systems (by which I mean typographic conventions associated with multiple languages, locales, etc.). So I wouldn't expect SLA to have distinct behaviour from SSL or SCS, but rather would provide a single tag for a single behaviour appropriate to both North and South Slavey. If a font maker wanted some distinct behaviour in SSL vs SCS, he or she would use those two tags, but otherwise would only need to use SLA.

In order for this to work, there needs to be a defined behaviour for software when mapping from ISO 639 language-tagged content to default OTL ls tags, such that the software would look first for a one-to-one mapping, e.g. scs to SCS for North Slavey, and then for a mapping as part of a set, e.g. scs,xsl to SLA for Slavey, and then fall back to dflt.
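
As a rough sketch of how that ordering could be derived from the registry entries quoted above (Python; the REGISTRY table, the candidate_tags name, and the unpadded tag strings are illustrative assumptions): tags whose mapping is a single ISO code sort ahead of tags whose mapping is a set, and dflt always comes last.

```python
# Tag -> ISO 639-3 mappings, as in the Slavey entries quoted above.
REGISTRY = {
    "SLA": {"scs", "xsl"},
    "SSL": {"xsl"},
    "SCS": {"scs"},
}


def candidate_tags(iso_code, registry=REGISTRY):
    """Return OTL tags to try for iso_code, most specific first."""
    matches = [tag for tag, codes in registry.items() if iso_code in codes]
    # Smaller mapping sets first: a one-to-one mapping beats a set mapping.
    matches.sort(key=lambda tag: len(registry[tag]))
    return matches + ["dflt"]


print(candidate_tags("scs"))  # ['SCS', 'SLA', 'dflt']
print(candidate_tags("xsl"))  # ['SSL', 'SLA', 'dflt']
```

If two tags mapped one-to-one to the same ISO code, this sketch would simply keep both in registry order; choosing between them would need information that the language tag alone does not carry.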

Am I foolish to assume there is never a situation where a single ISO code would map one-to-one to multiple OTL tags? Any other way this could be blown up?



John Hudson
Tiro Typeworks Ltd    www.tiro.com
Salish Sea, BC        tiro at tiro.com

Getting Spiekermann to not like Helvetica is like training a cat to stay out of water. But I'm impressed that people know who to ask when they want to ask someone to not like Helvetica. That's progress. -- David Berlow
