[mpeg-OTspec] Re: New work on 3rd edition of the OFF (AHG kick-off) - name table

Tue Aug 21 22:00:38 CEST 2012

One issue Bob's comments raise has to do with the way that platform and encoding IDs are used both for name records and cmap subtables. In a cmap subtable, the difference between UCS-2 and UTF-16 is really important since specific formats would be needed to support UTF-16. In contrast, there's nothing that would necessarily need to be different for name table data structures. In fact, I doubt that there's anywhere in the Windows platform where a name table string might get processed that would assume UCS-2 and _not_ UTF-16.

Hence, there might not be any problem if the spec were to state that 3/1 _or_ 3/10 name strings are assumed to be encoded as UTF-16; or even further, to stipulate that 3/10 should not be used in name records and that 3/1 name strings are assumed to be UTF-16.
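
As a minimal sketch of what that stipulation would mean for a consumer of the name table: the helper below is hypothetical (the name `decode_windows_name` is illustrative and not taken from any spec or library), and it assumes that both 3/1 and 3/10 name strings are read as UTF-16, stored big-endian like other OpenType data.

```python
def decode_windows_name(encoding_id: int, raw: bytes) -> str:
    """Decode a platform ID 3 name string (hypothetical helper).

    Assumption: encoding IDs 1 and 10 are both treated as UTF-16,
    as suggested above. Big-endian byte order is used because
    OpenType data is stored big-endian.
    """
    if encoding_id not in (1, 10):
        raise ValueError(f"unsupported Windows encoding ID: {encoding_id}")
    return raw.decode("utf-16-be")


# The same bytes decode identically under either encoding ID,
# including a character outside the BMP (U+1D11E).
raw = "Clef \U0001D11E".encode("utf-16-be")
assert decode_windows_name(1, raw) == decode_windows_name(10, raw)
```

Under this assumption the encoding ID carries no extra information for name records, which is why dropping 3/10 for names would lose nothing.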

Peter

From: mpeg-OTspec at yahoogroups.com [mailto:mpeg-OTspec at yahoogroups.com] On Behalf Of Levantovsky, Vladimir
Sent: August 8, 2012 8:28 AM
To: bobh528; mpeg-OTspec at yahoogroups.com
Subject: RE: [mpeg-OTspec] Re: New work on 3rd edition of the OFF (AHG kick-off) - name table

Hi Bob,

Thank you very much for taking the time to review the draft and for your comments.
Aside from the changes in OS/2 Panose field and new 'rclt' feature description, all other changes you currently see in the draft are rolled in from already issued and approved prior amendments and corrigendum. The text of the 'name' table description hasn't been modified at all recently, the last changes we made were discussed back in 2009/2010 when the second amendment was finalized. I verified that the current text is the exact match of OT 1.6 (http://www.microsoft.com/typography/otspec/name.htm) - with the exception of the example page (http://www.microsoft.com/typography/otspec/namesmp.htm) that is nested in the HTML version of OT1.6 and 'inlined' in the ISO text.

I agree with you that there are quite a few places where the current 'name' table text could be improved - in fact, the total re-write of this section was already proposed by Josh Hadley earlier this year: http://tech.groups.yahoo.com/group/mpeg-OTspec/message/714
Now may be a good time to discuss it in detail and see if we can improve this section of the spec while the editing period is still open (until 8/31/12). However, it's not a "now or never" kind of deal, so I don't want anyone to feel rushed to make changes. The clarity of the spec is what matters, so if it takes us a little longer to finalize it, that's fine (this is what the working drafts are for).

Thank you,
Vlad

From: mpeg-OTspec at yahoogroups.com [mailto:mpeg-OTspec at yahoogroups.com] On Behalf Of bobh528
Sent: Tuesday, August 07, 2012 6:04 PM
To: mpeg-OTspec at yahoogroups.com
Subject: [mpeg-OTspec] Re: New work on 3rd edition of the OFF (AHG kick-off) - name table

(sorry -- previous post seems to have gone astray...)

On 2012-07-27 at 15:06 Levantovsky, Vladimir wrote:
I would like to ask you to review the first draft text

Thanks for getting this process going.

I have some questions about the spec for the name table.

1) In section 5.2.6.3 Name IDs, below the table of name IDs, is a Note in which the text:
All 'name' table strings for platform ID 3 (Windows platform) must be in Unicode, using the UTF-16 encoding form.  The character set encoding for 'name' table strings with platform ID 0 (Macintosh) is determined by the encoding ID.
has been replaced with:

Note that OS/2 and Windows both require that all name strings be defined in Unicode. Thus all 'name' table strings for platform ID = 3 (Windows) will require two bytes per character. Macintosh fonts require single byte strings.

This appears to be a regression to the text from MS spec 1.6 -- is that intended?  If so, the "two bytes per character" phrase needs to be updated to modern language.

But in either case, a key question is whether SMP characters (coded using surrogate pairs) are permitted or not. If they are, then the correct term to use is "UTF-16". If they are not, then "UTF-16" is not the correct term -- I think the correct term would then be "UCS-2".
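
To make the distinction concrete, here is a small sketch (the helper names are illustrative only) that lists the 16-bit code units of a string. A BMP character takes one unit, while an SMP character takes two units forming a surrogate pair; a strict UCS-2 reading has no way to represent the latter.

```python
def utf16_code_units(text: str) -> list[int]:
    """Return the UTF-16 code units of text as integers."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def needs_surrogates(text: str) -> bool:
    """True if any code unit falls in the surrogate range D800..DFFF."""
    return any(0xD800 <= unit <= 0xDFFF for unit in utf16_code_units(text))


bmp_char = "A"            # U+0041, inside the BMP
smp_char = "\U0001D11E"   # U+1D11E MUSICAL SYMBOL G CLEF, in the SMP

print([hex(u) for u in utf16_code_units(bmp_char)])  # ['0x41']
print([hex(u) for u in utf16_code_units(smp_char)])  # ['0xd834', '0xdd1e']
```

So "two bytes per character" holds only for BMP text; a name string containing U+1D11E occupies four bytes for that one character.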

2) Section 5.2.6.2 Platform IDs, Platform-specific encoding IDs and Language IDs currently includes this table:

Windows platform-specific encoding IDs (platform ID = 3)

Platform ID   Encoding ID   Description
3             0             Symbol
3             1             Unicode BMP (UCS-2)
3             2             ShiftJIS
3             3             PRC
3             4             Big5
3             5             Wansung
3             6             Johab
3             7             Reserved
3             8             Reserved
3             9             Reserved
3             10            Unicode UCS-4

What does the third column of this table mean? In the context, it seems to be saying that if I want a name string with SMP characters in it, then I can use 3/10 encoding and encode the string in UCS-4.  Is that what it is really saying?  If this is true, then it goes counter to either of the quotes in my question 1 above (about UTF-16 or 2-byte characters).

Bob Hallissy
