[mpeg-OTspec] script tags update
Adam Twardoch (List)
list.adam at twardoch.com
Thu Feb 25 03:26:35 CET 2016
If we take Unicode as the normative source, then it would be great if the OT script list gave the script names exactly as they are spelled in Unicode.
E.g.
http://www.unicode.org/Public/UCD/latest/ucd/Scripts.txt
uses the term "Oriya" (which is the old name), while
https://www.microsoft.com/typography/otspec/scripttags.htm
says:
    Odia (formerly Oriya)            orya
    Odia v.2 (formerly Oriya v.2)    ory2
This makes the table difficult to parse.
Basically, a column that cites the Unicode script names exactly as spelled would probably be best, i.e. "Syloti_Nagri" or "Ol_Chiki" etc.
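To illustrate why exact spellings help: with such a column, a consumer could check the registry against Scripts.txt mechanically. The following is only a rough sketch. The function name, the abridged sample lines, and the small OT_TAG_TO_UNICODE table are made up for illustration; the table stands in for the proposed column, which does not exist on the page today.

```python
import re

# Matches data lines such as "0B02..0B03    ; Oriya # Mc ..." or "0B01 ; Oriya # Mn ...".
# Comment lines (starting with "#") and blank lines do not match.
_DATA_LINE = re.compile(r"^([0-9A-F]{4,6})(?:\.\.([0-9A-F]{4,6}))?\s*;\s*(\w+)")


def unicode_script_names(lines):
    """Collect the script names used in Scripts.txt-style lines."""
    names = set()
    for line in lines:
        match = _DATA_LINE.match(line)
        if match:
            names.add(match.group(3))
    return names


# Abridged sample in the Scripts.txt layout (illustration only, not the full file).
SAMPLE = """\
# Scripts.txt-style sample
0B01          ; Oriya # Mn       ORIYA SIGN CANDRABINDU
0B02..0B03    ; Oriya # Mc   [2] ORIYA SIGN ANUSVARA..ORIYA SIGN VISARGA
A800..A801    ; Syloti_Nagri # Lo   [2] SYLOTI NAGRI LETTER A..SYLOTI NAGRI LETTER I
1C50..1C59    ; Ol_Chiki # Nd  [10] OL CHIKI DIGIT ZERO..OL CHIKI DIGIT NINE
"""

# Hypothetical stand-in for the proposed column: OT script tag -> Unicode script name.
OT_TAG_TO_UNICODE = {
    "orya": "Oriya",
    "ory2": "Oriya",
    "sylo": "Syloti_Nagri",
    "olck": "Ol_Chiki",
}

known = unicode_script_names(SAMPLE.splitlines())
# Any OT tag whose cited name is not a Unicode script name would show up here.
unmatched = {tag for tag, name in OT_TAG_TO_UNICODE.items() if name not in known}
print(sorted(known), sorted(unmatched))
```

With a label like "Odia (formerly Oriya)" the lookup would need special-casing; with "Oriya" in a dedicated column it is a plain string comparison.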
A.
> On 25 Feb 2016, at 03:10, Adam Twardoch (List) <list.adam at twardoch.com> wrote:
>
>>
>> On 25 Feb 2016, at 00:59, Peter Constable petercon at microsoft.com [mpeg-OTspec] <mpeg-OTspec-noreply at yahoogroups.com> wrote:
>>
>>
>> Vlad, and others:
>>
>> In anticipation of Unicode 9, we’ve updated the table of script tags in the OpenType Layout tag registry with Unicode 9 additions. Also, somehow we had overlooked updating that page last year for Unicode 8, so those additional script tags got added as well. The page is now completely up to date for Unicode 9.
>>
>> https://www.microsoft.com/typography/otspec/scripttags.htm
>>
>> It would be good to have these reflected in the next update to 14496-22. I wouldn’t expect any inconsistencies between this and what is currently in 14496-22, but please take a look and let me know if you see any issues.
>
> Peter,
>
> I seem to recall that some time ago, the Microsoft Typography website had a table (that you put together, I think) which provided a useful informative mapping of the OT script tags to ISO 15924 script tags. I realize that this was never a perfect mapping, but I think quite a few people found it highly useful.
>
> I think language tags are more complex: there are too many ISO standards for language codes anyway, and the OT language systems are really quite distinct by themselves. Plus, they're relatively rarely used.
>
> But given that libraries such as ICU perform run itemization and typically return the ISO 15924 script codes, while the OpenType Layout processor is supposed to apply features within the OT script system, I think a mapping would be desirable.
>
> This could be an "exception-based listing" only, similar to how this is solved in HarfBuzz:
>
> 1. ISO 15924 script tags to HB constants:
> https://github.com/behdad/harfbuzz/blob/master/src/hb-common.h
> 2. HB constants to OT script tags:
> https://github.com/behdad/harfbuzz/blob/master/src/hb-ot-tag.cc
>
> The bulk of the conversion is fully algorithmic, i.e. the ISO and OT script tags are identical except for the case of the first letter. But there are exceptions. If the future intention is indeed to keep the new OT script tags in sync with ISO 15924 as much as possible, it would be wonderful if
> https://www.microsoft.com/typography/otspec/scripttags.htm
> did indeed include an "informative" column citing exceptions to the rule.
>
> Or otherwise, we could forgo ISO and provide a mapping of the OT script tags to the Unicode script names as used in http://www.unicode.org/Public/UCD/latest/ucd/Scripts.txt , citing exceptions the same way.
>
> Since USE is now bringing OpenType and Unicode closer together, I think such a mapping would be practical.
>
> Best,
> Adam
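
The "exception-based listing" idea from the quoted message can be sketched roughly as follows. This is a minimal illustration, not HarfBuzz's actual code: the function name and the table are hypothetical, and the exceptions listed are the ones I believe apply to the older script tags; they should be checked against the current registry before anyone relies on them.

```python
# Exceptions to the "lowercase the first letter" rule (assumed, not exhaustive).
ISO_TO_OT_EXCEPTIONS = {
    "Hira": "kana",  # Hiragana shares the 'kana' tag with Katakana
    "Laoo": "lao ",  # shorter OT tags are padded with spaces to four characters
    "Yiii": "yi  ",
    "Nkoo": "nko ",
    "Vaii": "vai ",
}


def iso15924_to_ot_tag(iso_code):
    """Map an ISO 15924 code such as 'Orya' to an OT script tag such as 'orya'."""
    if len(iso_code) != 4 or not iso_code.isalpha():
        raise ValueError("expected a four-letter ISO 15924 code: %r" % (iso_code,))
    if iso_code in ISO_TO_OT_EXCEPTIONS:
        return ISO_TO_OT_EXCEPTIONS[iso_code]
    # Algorithmic case: identical except for the case of the first letter.
    return iso_code[0].lower() + iso_code[1:]


for code in ("Latn", "Orya", "Laoo", "Hira"):
    print(code, "->", repr(iso15924_to_ot_tag(code)))
```

Note that the "v.2" tags such as ory2 have no ISO 15924 code of their own, so a mapping in this direction can only yield the original tag (orya); choosing between the two remains the layout engine's job.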