FW: [mpeg-OTspec] Re: [OpenType] MS Proposal for a new Name Table ID

Tue Jan 8 23:31:13 CET 2013

FYI

-----Original Message-----
From: listmaster at indx.co.uk [mailto:listmaster at indx.co.uk] On Behalf Of Adam Twardoch (List)
Sent: Tuesday, January 08, 2013 4:48 PM
To: "multiple recipients of OpenType"@mail.indx.co.uk
Subject: Re: [mpeg-OTspec] Re: [OpenType] MS Proposal for a new Name Table ID

Message from OpenType list:

On July 13, 2009, I wrote the following proposal to this very list:

== Start quote ==

Subject: Empty OpenType features as a way for machine-readable font characterization

Imagine a font that has no lowercase forms, but instead, its lowercase characters actually have the form of smallcaps -- for example Chevalier, Bank Gothic or Adobe's Sava Pro and Trajan Pro.

I think that such a font might actually sport an empty "smcp" feature as well, one that has no lookups -- but which could act as a machine-readable hint to the application that that font has, in fact, smallcaps.

I'm thinking that the idea of using empty OT features as a way to indicate stylistic parameters of a font is not bad. For example, imagine an OpenType font that only has proportional oldstyle figures, and no other figure styles. Obviously, the proportional oldstyle figures would be encoded using the appropriate Unicode codepoints for figures -- but wouldn't it make sense to also include empty "onum" and "pnum" features so that some future applications could "know" what type of figures the font has by default?

Similarly, a Chinese font that only has traditional Chinese characters could include an empty "trad" feature, and a font that only has simplified Chinese characters could include an empty "smpl" feature.

A font that consists purely of ornaments or dingbats encoded as Latin uppercase or lowercase could include an empty "ornm" feature. In a sense, those features could act as a supplemental mechanism to the Unicode range bits -- they could be a simple, machine-readable way to indicate stylistic characteristics of a font.

I mean, OpenType Layout features do in fact perform two functions: one is to actually perform a specific action (such as substitutions or positioning), and the other is simply to serve as a catalog for a font's typographic characteristics. Checking for mere presence of a certain OT feature is what applications could do to filter fonts and present to the user just those that fulfil specific criteria.

Whether those typographic characteristics are achieved by triggering a substitution or positioning, or if they perhaps "are there by default", is kind of secondary.

Best,
Adam

== End quote ==
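
As a rough illustration of the presence check described in the quoted proposal, here is a minimal Python sketch. It assumes the application already holds the raw bytes of a font's GSUB (or GPOS) table and reads only the FeatureList, reporting each feature tag together with the number of lookups it references. The names list_features, empty_features and SAMPLE_GSUB are made up for this example, and SAMPLE_GSUB is a hand-built fragment rather than a table taken from a real font.

```python
import struct


def list_features(table: bytes) -> dict[str, int]:
    """Map each feature tag in a raw GSUB/GPOS table to its lookup count."""
    # Header (big-endian uint16 fields): majorVersion, minorVersion,
    # scriptListOffset, featureListOffset, lookupListOffset.
    _major, _minor, _scripts, feature_list, _lookups = struct.unpack_from(
        ">5H", table, 0
    )
    (count,) = struct.unpack_from(">H", table, feature_list)
    features: dict[str, int] = {}
    for i in range(count):
        # FeatureRecord: 4-byte tag plus an offset from the FeatureList start.
        tag, offset = struct.unpack_from(">4sH", table, feature_list + 2 + 6 * i)
        # Feature table: featureParamsOffset, lookupIndexCount, indices...
        _params, lookup_count = struct.unpack_from(
            ">HH", table, feature_list + offset
        )
        name = tag.decode("ascii")
        # A tag may occur in several records; keep the largest lookup count.
        features[name] = max(features.get(name, 0), lookup_count)
    return features


def empty_features(table: bytes) -> list[str]:
    """Tags that are declared but reference no lookups at all."""
    return sorted(t for t, n in list_features(table).items() if n == 0)


# Hand-built fragment for illustration only (not a complete, valid table):
# "liga" references one lookup, "smcp" is declared but empty.
SAMPLE_GSUB = b"".join([
    struct.pack(">5H", 1, 0, 10, 12, 36),  # header
    struct.pack(">H", 0),                  # ScriptList: no scripts
    struct.pack(">H", 2),                  # FeatureList: two records
    struct.pack(">4sH", b"liga", 14),
    struct.pack(">4sH", b"smcp", 20),
    struct.pack(">3H", 0, 1, 0),           # liga: one lookup index
    struct.pack(">2H", 0, 0),              # smcp: no lookups
    struct.pack(">H", 0),                  # LookupList left empty here
])

print(list_features(SAMPLE_GSUB))   # {'liga': 1, 'smcp': 0}
print(empty_features(SAMPLE_GSUB))  # ['smcp']
```

A font menu could then filter on `"smcp" in list_features(table)` regardless of whether the feature does any work, which is the "catalog" role the proposal has in mind.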

It's not exactly what this discussion is all about, but is similar. That proposal of mine went largely uncommented (except one sneaky comment from Miguel Sousa), and I'm not sure whether that proposal of mine was any good. But I'm re-quoting it here because I think it addresses a *related* issue, or -- if extended -- could actually be used as an alternative solution to this problem. Perhaps. After all, there is already a tradition in existing applications to parse the combination of OpenType Layout tags (for script, languagesystem and feature) within a font, and deduce some indications from it.

I agree with Jelle to the extent that the point of the Microsoft "name ID" proposal that we're currently discussing is certainly not to provide an *exhaustive* list of what a font supports, but instead, to provide a "complementary" mechanism when other kinds of heuristics fail to yield a conclusive result. So it's meant to be used rarely in special situations. This is very different from, let's say, the "cmap" table, which is meant to be exhaustive.

The concept of OT Layout tags sits somewhere in the middle, but it is also more "complementary" than exhaustive, i.e. you don't need a "cyrl" script tag to be present in GSUB or GPOS for a font to render Cyrillic, or the "hani" script tag for it to render Chinese. OT Layout tags are currently used only when there is a "special" need for them (e.g. special layout rules for a certain script, or languagesystem).

So before we proceed any further, we need to make very clear to ourselves:

* Are we talking about an EXHAUSTIVE mechanism for indicating language support in a font (I believe that's not the case),
* or are we talking about a SUPPORTIVE mechanism which provides additional clarification when information from existing sources (cmap, OTL tags, OS/2 info) is not sufficient?
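
To make the second reading concrete, here is a small Python sketch of how an application might order its checks if the new data is only SUPPORTIVE. Everything in it is hypothetical: the FontInfo summary, the suitable_for function, the shape_sensitive flag and the language strings are invented for illustration and are not part of the MS proposal or of any existing API.

```python
from dataclasses import dataclass, field


@dataclass
class FontInfo:
    """Hypothetical summary of a font, extracted elsewhere."""
    codepoints: set[int]  # what the cmap covers
    declared_languages: set[str] = field(default_factory=set)  # extra hint


def suitable_for(font: FontInfo, required: set[int], lang: str,
                 shape_sensitive: bool) -> bool:
    # Step 1: the exhaustive source. Without the characters, nothing helps.
    if not required <= font.codepoints:
        return False
    # Step 2: for most scripts, character coverage alone is conclusive.
    if not shape_sensitive:
        return True
    # Step 3: only when glyph shapes differ per language is the
    # supplementary declaration consulted at all.
    return lang in font.declared_languages


# A Chinese font that happens to cover the characters needed for Japanese.
chinese = FontInfo(codepoints={0x3042, 0x76F4}, declared_languages={"zh"})
print(suitable_for(chinese, {0x3042, 0x76F4}, "ja", shape_sensitive=True))
# False

# A Latin font with the characters needed for Swiss German.
latin = FontInfo(codepoints={0x61, 0xE4})
print(suitable_for(latin, {0x61, 0xE4}, "de-CH", shape_sensitive=False))
# True
```

Under the EXHAUSTIVE reading, step 3 would have to run for every font and every language, which is exactly the burden the supportive reading avoids.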

As for the technical implementation, I do echo John Jenkins' skepticism about whether the "name" table is the best place for this. There are several reasons for it:

1. The "name" table, to me, is somewhat volatile in nature. Many tools or workflows intervene in it, font vendors frequently customize their fonts, and very often, the "name" table is the one that is changed. I think that significant functional aspects of the font, especially some core information about a font's "linguistic purposefulness", should not be placed into such a volatile table. In a way, the "name" table is the product's packaging.

It's the car's *body*. Every car needs a body, but the body can be repainted, reshaped, even scratched or bent -- the car still drives. The MS proposal looks a bit to me as if we were modifying the car's design so that, in future, if someone scratches the car's paint or makes a small dent, the car may refuse to start the engine.

2. To me, the main purpose of the "name" table is to store human-readable data, not machine-readable data. Some aspects of the "name" table are of course used for machine-reading purposes (for example, the Family name fields are used to unify fonts under a common menu name), but even those names are *human-readable*. Introducing purely machine-readable contents into the "name" table is to me horribly inelegant. It's almost as if we stored the four-letter OT script tags in the "name" table rather than in the GSUB or GPOS table, and used some "anonymous" numerical IDs in the GSUB table to refer to them.

While OpenType already is full of inconsistencies, full of sloppiness and of bad conceptual design (which we can safely credit to the long evolution and complex developments on the way), there is still a certain "conceptual design spirit" behind TrueType/OpenType, and I think when proposing new ideas, especially TODAY (with all the experience we have gained from previous developments), we should not sacrifice good conceptual design of the format just to implement more "hacks". And to me, putting machine-readable language tags into the "name" table looks a bit like a hack.

It's not a horrible hack, and the MS proposal is "okayish" in the sense that it's "not horrible", but it certainly falls short of elegance. (I mean: technically. I fully agree that the underlying need to solve the problem is real, and that we should solve it somehow.)

Best,
Adam

On 13-01-08 22:09, Martin Hosken wrote:
> Message from OpenType list:
>
>
> On Tue, 8 Jan 2013 19:21:53 +0000
> "Jelle Bosma " <jelleb at euronet.nl> wrote:
>
>> Message from OpenType list:
>>
>>
>>
>> Op 8-jan-2013, om 17:20 heeft Martin Hosken het volgende geschreven:
>>>
>>>> What multi language scripts with shape variants am I missing, that 
>>>> we risk running out of bits?
>>> I think there are two questions that are being conflated here:
>>>
>>> 1. Does this font have the character set coverage needed for writing 
>>> system X.
>>>
>>> 2. Is this font styled appropriately for writing system X.
>>>
>>> The first can be answered by cmap query given appropriate 
>>> information to the application about the character set coverage 
>>> requirements for X.
>> Hi,
>>
>> This is not about writing systems, but about languages within writing 
>> systems.
> Ah well, this is where the problems of terminology bite us. To me a writing system is the intersection of a language and a script. So you can't talk about a language within a writing system (to me). A language within a script is a writing system. Now I realise you have a different interpretation of the term.
>
>> So if you have a Chinese font that has all Kana, Kanji and Latin 
>> characters that are needed for Japanese, you would not be able to use 
>> it for Japanese, because the construction of the ideographs is 
>> different. So here you need extra information in the font to mark it 
>> as Chinese so that you can substitute a proper Japanese font instead, 
>> if the original font is not available to display a document.
> This is an example of issue 2 above.
>
>> But if you have Latin font designed in Portugal, with all the 
>> characters needed to write Swiss German, you can safely use it for
>> Swiss German and there is no need to substitute it for Helvetica.   
>> The design may or may not be liked, but the construction of the 
>> characters is the same and legible. It would have been a problem if 
>> Unicode had decided to share the Latin character codes with Greek and 
>> Cyrillic, but they haven't.  ;-)
> And this is an example of issue 1.
>
>> So you would never have to add tags to a Latin font to list all the 
>> languages that it supports, because that information is in the cmap, 
>> if it is not a language which isn't covered by one of the OS/2 code 
>> pages already. And that must be true for most writing systems. You 
>> only need tags or some other identifier, when a script has characters 
>> that have shape variants that are not legible in all languages that 
>> are written in that script. Which narrows it down. Of course the GSUB 
>> already provides a mechanism of sorts with the option for language 
>> specific substitution.
> And my suggestion is to use the GSUB lang to flag that.
>
> Yours,
> Martin
>
>
>
> List archive: http://www.indx.co.uk/biglistarchive/
>
> subscribe: opentype-migration-sub at indx.co.uk
> unsubscribe: opentype-migration-unsub at indx.co.uk
> messages: opentype-migration-list at indx.co.uk
>
>

-- 

May success attend your efforts,
-- Adam Twardoch
(Remove "list." from e-mail address to contact me directly.)
