[mpeg-OTspec] Armenian East-West Distinctions [1 Attachment]

Andrew Glass (WINDOWS) Andrew.Glass at microsoft.com
Wed Jul 10 19:01:53 CEST 2013


Hi John,

There are a couple of options here: make a new language tag or make a new script tag. In principle, the language is the same, so it is not appropriate to make a new language tag – I believe that the ISO 639-3 registration authority would reject the request. The differences you refer to are in the orthography, and so a script difference is appropriate. There is precedent for script variants to be given distinct script codes within ISO 15924. See for example Syriac, Han, and Latin:

http://www.unicode.org/iso15924/iso15924-en.html

I would recommend that you check with Michael Everson and prepare a proposal to add a new variant script code for Armenian. The OpenType standard should then be updated to support the new script code once it becomes available.

I think it is important to go through ISO 15924 so that there can be a standard way for a user, web page, or document to identify a preference for this orthography. If the script code is in 15924 it will be picked up by BCP47. That will mean it can be included in language tags associated with user preferences, web pages, and documents. There may be remaining work for implementers to ensure that the preference expressed in the BCP47 language tag is available to the rendering engine that would trigger the appropriate lookups in the font.

Cheers,

Andrew


From: mpeg-OTspec at yahoogroups.com [mailto:mpeg-OTspec at yahoogroups.com] On Behalf Of John Hopkins
Sent: Friday, June 28, 2013 12:14
To: mpeg-OTspec at yahoogroups.com
Subject: [mpeg-OTspec] Armenian East-West Distinctions [1 Attachment]


[Attachment(s) from John Hopkins included below]
In January, I posted to the Mpeg-OTspec group some communication with Vlad Levantovsky regarding Armenian East-West distinctions we had been grappling with for years. For convenience, I include it here along with the follow-up conversation, and add some information provided today by my domain expert, George Simper. I hope this will help motivate an additional OpenType language tag for Armenian West.

Best regards,
John D. Hopkins

Vlad,

…

For many years we have been dealing with distinctions between Armenian East (used inside Armenia) vs. Armenian West (used outside of Armenia, including the US). There are many differences between these two forms of the language and how they affect font glyphs. Our company's linguists say that Microsoft and Adobe products only use rules corresponding to Armenian West, but our needs are for Armenian East. So we have built fonts that favor the latter, which we use in-house for our own publications. But we also have older publications that are supposed to use the Armenian West. So we really need both.

There is no distinction between these two variants of Armenian using ISO 639-3 exclusively. Nor are there really any clear distinctions in OpenType's language tags. In the OpenType Spec, there is a script tag 'armn' and a language tag 'HYE ' for Armenian. There is also the 'dflt' language tag. We could really use a standard way of handling font glyph differences between these two variants of Armenian within OpenType. We have observed that, under the 'armn' script tag, both the 'dflt' and 'HYE ' tags use Armenian West rules in Microsoft and Adobe products. Perhaps an additional OpenType tag of 'HYE2' or 'HYEE' could be added that would correspond to Armenian East rules. (Another possibility is to use 'HYE ' for Armenian East and 'HYEW' for Armenian West. It does not really matter to us, as long as there is a clear distinction in the OTSpec.)
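
For reference, OpenType script and language system tags are four bytes of printable ASCII, with shorter names padded by trailing spaces, which is why the language tag above is written 'HYE ' with a space. A minimal Python sketch of that encoding follows. The helper names make_tag and tag_to_uint32 are illustrative only, and 'HYEW' is the tag proposed in this message, not a registered one.

```python
def make_tag(text: str) -> bytes:
    """Return a 4-byte OpenType-style tag, padding short names with spaces."""
    if not 1 <= len(text) <= 4:
        raise ValueError("a tag must have between 1 and 4 characters")
    raw = text.ljust(4).encode("ascii")  # non-ASCII input raises an error here
    if any(not 0x20 <= byte <= 0x7E for byte in raw):
        raise ValueError("a tag must consist of printable ASCII characters")
    return raw


def tag_to_uint32(tag: bytes) -> int:
    """Interpret a 4-byte tag as a big-endian unsigned 32-bit integer."""
    if len(tag) != 4:
        raise ValueError("a tag must be exactly 4 bytes long")
    return int.from_bytes(tag, "big")


script_tag = make_tag("armn")      # script tag for Armenian
language_tag = make_tag("HYE")     # stored as b"HYE " with a trailing space
proposed_tag = make_tag("HYEW")    # proposed in this thread, not registered
```

Because the tag is a fixed four bytes, 'HYE ' and a four-letter proposal such as 'HYEW' or 'HYEE' are distinct values and can coexist under the same 'armn' script tag.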

Again, thank you for your help. We have not known exactly who to turn to, to resolve these issues, though we have brought them up to both Microsoft and Adobe in years past, unfortunately without success. Maybe you and this group can help.

Best regards,
John D. Hopkins


>>>>>>>> I added further:

Note that IANA tags used for BCP-47 language tagging distinguish these two variants of Armenian:

Type: variant
Subtag: arevela
Description: Eastern Armenian
Added: 2006-09-18
Prefix: hy
%%
Type: variant
Subtag: arevmda
Description: Western Armenian
Added: 2006-09-18
Prefix: hy
%%
(from http://www.iana.org/assignments/language-subtag-registry)
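
As an illustration of how software might read these variant subtags, here is a minimal Python sketch that reports which Armenian variant a BCP 47 tag requests. The function name armenian_variant is hypothetical, and the sketch only splits the tag on hyphens; it does not validate the tag against the full BCP 47 grammar.

```python
from typing import Optional

# Variant subtags registered with the prefix "hy", as listed above.
ARMENIAN_VARIANTS = ("arevela", "arevmda")


def armenian_variant(tag: str) -> Optional[str]:
    """Return 'arevela' or 'arevmda' if the tag names one, otherwise None."""
    subtags = tag.lower().split("-")  # subtags are case-insensitive
    if subtags[0] != "hy":
        return None  # not an Armenian language tag
    for variant in ARMENIAN_VARIANTS:
        if variant in subtags[1:]:
            return variant
    return None  # plain "hy" carries no East/West preference
```

For example, armenian_variant("hy-arevela") returns "arevela", while a plain "hy" returns None, which leaves the choice of default to the application.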

>>>>>>>>> Vlad responded:

Thank you John,

I think the issues you brought up are important to address, but they seem to require quite a bit of further discussion and work. I doubt that simply adding new tags would solve it by itself; we also need to understand the differences between the languages and document them properly so that companies like Adobe, Microsoft, Monotype (and many independent font vendors) would know what the problems are and how to solve them, what is required of implementations, etc. The lack of this understanding is what may have hampered progress on this issue – we definitely need to continue this discussion to find a solution.

Thank you,
Vlad

From: George Simper
Date: Fri, 28 Jun 2013 11:22:09 -0600
To: "John D. Hopkins"
Subject: Armenian-East and Armenian-West

John

The majority of the reference materials were provided in 2007 by Timo Koponen, our language supervisor in Europe.

I. Pronunciation differences
As we saw on the site http://www.omniglot.com/writing/armenian.htm, the artwork for the two languages is the same, but the pronunciation for the artwork is different. Mostly changes of voiced (dza, ga) to voiceless (tza, ka). See the attached PDF for the various changes in pronunciation.

Recommendation for supporting spelling differences
Because of the pronunciation differences, spellings will differ, so different spell checkers and grammar checkers are needed. Unique language tags for Armenian-East and Armenian-West will be required in 639-3 to support them.

II. Ligatures
The biggest problem with ligatures is one glyph, U+0587 և, which is handled quite differently depending on which form of Armenian you are working with. See the tables below.

Armenian-East

              Unicode        Artwork
Lower Case    0587           և
Upper Case    0535 + 057E    Ե + վ
ALL CAPS      0535 + 054E    Ե + Վ

OpenType rules for casing would require that 0587 be changed to 0535+057E for initial capitals, and to 0535+054E for all capitals or small capitals. They would also require that 0535+057E for initial capitals, and 0535+054E for all caps and small caps, be changed to 0587 when moving back to lower case.

Armenian-West

              Unicode        Artwork
Lower Case    0587           և
Upper Case    0535 + 0582    Ե + ւ
ALL CAPS      0535 + 0552    Ե + Ւ

OpenType rules for casing would require that 0587 be changed to 0535+0582 for initial capitals, and to 0535+0552 for all capitals or small capitals. They would also require that 0535+0582 for initial capitals, and 0535+0552 for all caps and small caps, be changed to 0587 when moving back to lower case.

There are other Armenian ligatures in Unicode (U+FB13..FB17). I have not yet found my documentation on those glyphs to tell me if they are the same or different in East vs. West.

Additional Documentation Problem
The Unicode documentation for U+0587 indicates that it is a ligature of U+0565 and U+0582, which is true for Armenian-West, but not for Armenian-East.

Recommendation for Supporting Ligature Requirements
Where I have two different sets of instructions for the same Unicode value (U+0587), I must have separate language definitions for Armenian-East and Armenian-West to properly handle the casing, etc.
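
To make the requirement concrete, here is a minimal Python sketch of variant-aware casing for U+0587. It is not how any shipping product works; the function names are hypothetical, and the variants are keyed by the BCP 47 subtags arevela and arevmda purely for illustration. The Western all-caps sequence uses U+0552, the capital counterpart of U+0582, as in Unicode's SpecialCasing data.

```python
ECH_YIWN_LIGATURE = "\u0587"  # և

# Replacement sequences for U+0587, one set per variant (assumed keys).
CASE_EXPANSIONS = {
    "arevela": {"title": "\u0535\u057E", "upper": "\u0535\u054E"},  # Ե+վ, Ե+Վ
    "arevmda": {"title": "\u0535\u0582", "upper": "\u0535\u0552"},  # Ե+ւ, Ե+Ւ
}


def to_upper(text: str, variant: str) -> str:
    """Uppercase text, expanding U+0587 according to the given variant."""
    expansion = CASE_EXPANSIONS[variant]["upper"]
    # Expand the ligature first so the generic upper() never sees it.
    return text.replace(ECH_YIWN_LIGATURE, expansion).upper()


def to_title_word(word: str, variant: str) -> str:
    """Capitalize one word; only a word-initial U+0587 is expanded."""
    if word.startswith(ECH_YIWN_LIGATURE):
        return CASE_EXPANSIONS[variant]["title"] + word[1:]
    return word[:1].upper() + word[1:]
```

The sketch covers only the lower-to-upper direction. Going back to lower case is harder, because a sequence such as 0535+054E may also occur in words where it is not an expanded ligature, so a plain reverse substitution could be wrong.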

Hope this helps

George

