[mpeg-OTspec] Re: New work on 3rd edition of the OFF (AHG kick-off) - name table
Behdad Esfahbod
behdad at behdad.org
Tue Aug 21 22:20:46 CEST 2012
I have wanted to raise an issue with 'name' table and Windows encodings for a
while. To me, it looks like there's some confusion in the spec right now.
The 'name' table page on MS otspec [1] says:
"When building a Unicode font for Windows, the platform ID should be 3 and the
encoding ID should be 1. When building a symbol font for Windows, the platform
ID should be 3 and the encoding ID should be 0."
This is plain wrong. I assume it was copy/pasted from the 'cmap' table. The
platform/encoding ID for a 'name' record should match the encoding of the name
strings, not the kind of glyphs the font has. Indeed, in fontconfig I had to
allow encoding ID 0 in the 'name' table to mean UTF-16BE...
Any clarification in this space will be appreciated.
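To make the point concrete, here is a minimal sketch of the lenient reading described above: a reader that decodes Windows-platform 'name' strings as UTF-16BE whether the record says encoding ID 0, 1 or 10. The function name and the set of accepted IDs are made up for illustration; this is not fontconfig's actual code and not spec text.

```python
from typing import Optional

# Windows (platform ID 3) encoding IDs that this sketch treats as UTF-16BE.
# 0 = Symbol, 1 = Unicode BMP, 10 = labelled "Unicode UCS-4" in the spec table.
# Accepting 0 and 10 here is an assumption based on the reading argued above.
WINDOWS_UTF16_ENCODING_IDS = {0, 1, 10}


def decode_windows_name(platform_id: int, encoding_id: int, raw: bytes) -> Optional[str]:
    """Decode the bytes of one 'name' record, or return None if not handled."""
    if platform_id != 3:
        # Other platforms have their own encoding tables; out of scope here.
        return None
    if encoding_id not in WINDOWS_UTF16_ENCODING_IDS:
        # Legacy code-page encodings (ShiftJIS, Big5, ...) are not handled.
        return None
    return raw.decode("utf-16-be")
```

With this, a record tagged 3/0 in a symbol font still yields a readable family name, instead of being skipped because of the kind of glyphs the font contains.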
In the same vein, on the same page, platform ID 3 encoding ID 10 is called
Unicode UCS-4 (which makes sense for the 'cmap' table), but for the 'name'
table it probably should say UTF-16BE instead.
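As a rough illustration of why UTF-16BE is enough for name strings, the sketch below encodes a string containing a non-BMP character and checks for surrogate code units. The helper names are hypothetical and the sample character is my own choice.

```python
def encode_name_utf16be(text: str) -> bytes:
    """Encode a name string the way a 3/1 (or 3/10) record would store it."""
    return text.encode("utf-16-be")


def has_surrogate_pairs(raw: bytes) -> bool:
    """True if the UTF-16BE byte string contains a high-surrogate code unit."""
    units = [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw) - 1, 2)]
    return any(0xD800 <= unit <= 0xDBFF for unit in units)


# "A" followed by U+1D11E (MUSICAL SYMBOL G CLEF), which lies outside the BMP.
sample = encode_name_utf16be("A\U0001D11E")
```

The non-BMP character is stored as two 16-bit code units, so the record layout is the same as for a BMP-only string; only the interpretation of the code units (UCS-2 versus UTF-16) differs.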
behdad
[1] http://www.microsoft.com/typography/otspec/name.htm
On 08/21/2012 04:00 PM, Peter Constable wrote:
>
>
> One issue Bob’s comments raise has to do with the way that platform and
> encoding IDs are used both for name records and cmap subtables. In a cmap
> subtable, the difference between UCS-2 and UTF-16 is really important since
> specific formats would be needed to support UTF-16. In contrast, there’s
> nothing that would necessarily need to be different for name table data
> structures. In fact, I doubt that there’s anywhere in the Windows platform
> where a name table string might get processed that would assume UCS-2 and
> _/not/_ UTF-16.
>
> Hence, there might not be any problem if the spec were to state that 3/1
> _/or/_ 3/10 name strings are assumed to be encoded as UTF-16; or even further,
> to stipulate that 3/10 should not be used in name records and that 3/1 name
> strings are assumed to be UTF-16.
>
> Peter
>
>
>
> *From:*mpeg-OTspec at yahoogroups.com [mailto:mpeg-OTspec at yahoogroups.com] *On
> Behalf Of *Levantovsky, Vladimir
> *Sent:* August 8, 2012 8:28 AM
> *To:* bobh528; mpeg-OTspec at yahoogroups.com
> *Subject:* RE: [mpeg-OTspec] Re: New work on 3rd edition of the OFF (AHG
> kick-off) - name table
>
>
>
>
>
> Hi Bob,
>
>
>
> Thank you very much for taking the time to review the draft and for your comments.
>
> Aside from the changes in the OS/2 Panose field and the new ‘rclt’ feature
> description, all other changes you currently see in the draft are rolled in
> from already issued and approved prior amendments and corrigenda. The text of
> the ‘name’ table description hasn’t been modified at all recently; the last
> changes we made were discussed back in 2009/2010 when the second amendment was
> finalized. I verified that the current text is an exact match of OT 1.6
> (http://www.microsoft.com/typography/otspec/name.htm) – with the exception of
> the example page (http://www.microsoft.com/typography/otspec/namesmp.htm) that
> is nested in the HTML version of OT 1.6 and ‘inlined’ in the ISO text.
>
>
>
> I agree with you that there are quite a few places where the current ‘name’
> table text could be improved – in fact, the total re-write of this section was
> already proposed by Josh Hadley earlier this year:
> http://tech.groups.yahoo.com/group/mpeg-OTspec/message/714
>
> Now may be a good time to discuss it in detail and see if we can improve this
> section of the spec while the editing period is still open (until 8/31/12).
> However, it’s not a “now or never” kind of deal, so I don’t want anyone to feel
> rushed to make changes – the clarity of the spec is what matters, so if it
> takes us a little longer to finalize it – it’s fine (this is what the working
> drafts are for).
>
>
>
> Thank you,
>
> Vlad
>
> *From:*mpeg-OTspec at yahoogroups.com <mailto:mpeg-OTspec at yahoogroups.com>
> [mailto:mpeg-OTspec at yahoogroups.com] *On Behalf Of *bobh528
> *Sent:* Tuesday, August 07, 2012 6:04 PM
> *To:* mpeg-OTspec at yahoogroups.com <mailto:mpeg-OTspec at yahoogroups.com>
> *Subject:* [mpeg-OTspec] Re: New work on 3rd edition of the OFF (AHG kick-off)
> - name table
>
> (sorry -- previous post seems to have gone astray...)
>
>
>
> On 2012-07-27 at 15:06 Levantovsky, Vladimir wrote:
>
> I would like to ask you to review the first draft text
>
>
> Thanks for getting this process going.
>
> I have some questions about the spec for the name table.
>
> 1) In section 5.2.6.3 Name IDs, below the table of name IDs, is a Note in
> which the text:
>
> All 'name' table strings for platform ID 3 (Windows platform) must be in
> Unicode, using the UTF-16 encoding form. The character set encoding for 'name'
> table strings with platform ID 0 (Macintosh) is determined by the encoding ID.
>
> has been replaced with:
>
>
> Note that OS/2 and Windows both require that all name strings be defined in
> Unicode. Thus all 'name' table strings for platform ID = 3 (Windows) will
> require two bytes per character. Macintosh fonts require single byte strings.
>
>
> This appears to be a regression to the text from MS spec 1.6 -- is that
> intended? If so, the "two bytes per character" phrase needs to be updated to
> modern language.
>
> But in either case, a key question is whether SMP characters (coded using
> surrogate pairs) are permitted or not. If they are, then the correct term to
> use is "UTF-16". If they are not, then "UTF-16" is /not/ the correct term -- I
> think the correct term would then be "UCS-2".
>
> 2) Section 5.2.6.2 /Platform IDs, Platform-specific encoding IDs and
> Language IDs/ currently includes this table:
>
>
> *Windows platform-specific encoding IDs (platform ID = 3)*
>
> Platform ID   Encoding ID   Description
> 3             0             Symbol
> 3             1             Unicode BMP (UCS-2)
> 3             2             ShiftJIS
> 3             3             PRC
> 3             4             Big5
> 3             5             Wansung
> 3             6             Johab
> 3             7             Reserved
> 3             8             Reserved
> 3             9             Reserved
> 3             10            Unicode UCS-4
>
>
> What does the third column of this table mean? In context, it seems to be
> saying that if I want a name string with SMP characters in it, then I can use
> the 3/10 encoding and encode the string in UCS-4. Is that what it is really
> saying? If this is true, then it runs counter to /either/ of the quotes in my
> question 1 above (about UTF-16 or 2-byte characters).
>
> Bob Hallissy