[mpeg-OTspec] Re: New work on 3rd edition of the OFF (AHG kick-off) - name table

Behdad Esfahbod behdad at behdad.org
Tue Aug 21 22:20:46 CEST 2012


I have wanted to raise an issue with the 'name' table and Windows encodings
for a while.  To me, it looks like there's some confusion in the spec right
now.  The 'name' table page on MS otspec [1] says:

"When building a Unicode font for Windows, the platform ID should be 3 and the
encoding ID should be 1. When building a symbol font for Windows, the platform
ID should be 3 and the encoding ID should be 0."

This is plain wrong.  I assume it was copy/pasted from the 'cmap' table.  The
platform/encoding ID for the 'name' table should match the encoding of the
name strings, not what kind of glyphs the font has.  Indeed, in fontconfig I
had to allow encoding ID 0 in the 'name' table to mean UTF-16BE...
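
To make that concrete, here is a rough sketch of the lookup I have in mind.
The function name and the ID set are made up for illustration (this is not
fontconfig's actual code); the only assumption is that Windows-platform
records with encoding ID 0 or 1 both carry UTF-16BE strings:

```python
from typing import Optional

WINDOWS_PLATFORM = 3
# Assumption for illustration: both 3/0 (Symbol) and 3/1 (Unicode BMP)
# 'name' records hold UTF-16BE strings.
UTF16BE_ENCODING_IDS = {0, 1}


def decode_name_string(platform_id: int, encoding_id: int, raw: bytes) -> Optional[str]:
    """Decode one 'name' record, or return None if the pair is not handled here."""
    if platform_id == WINDOWS_PLATFORM and encoding_id in UTF16BE_ENCODING_IDS:
        return raw.decode("utf-16-be")
    return None


# A symbol font's family name still decodes as ordinary UTF-16BE text.
family = decode_name_string(3, 0, b"\x00A\x00b")
```

The point is that the decision depends only on how the string bytes are
encoded, never on what the font's 'cmap' looks like.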

Any clarification in this space will be appreciated.

In the same vein, on the same page, platform ID 3 encoding ID 10 is called
Unicode UCS-4 (which makes sense for the 'cmap' table), but in the 'name'
table it probably should say UTF-16BE instead.
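
A small illustration of why the label matters for name strings. The sample
string is arbitrary (a BMP letter followed by an SMP character, U+1D11E); it
just shows that UTF-16BE and UCS-4 produce different bytes for the same text,
so a reader has to know which one a 3/10 record uses:

```python
# Arbitrary sample: "A" (BMP) followed by U+1D11E (outside the BMP).
sample = "A\U0001D11E"

utf16 = sample.encode("utf-16-be")  # SMP character becomes a surrogate pair
ucs4 = sample.encode("utf-32-be")   # every character takes four bytes

print(utf16.hex())  # 0041d834dd1e
print(ucs4.hex())   # 000000410001d11e

# Reading UCS-4 bytes as UTF-16BE does not fail here, it silently yields
# different text, which is why the spec wording needs to be unambiguous.
misread = ucs4.decode("utf-16-be")
print(misread == sample)  # False
```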

behdad


[1] http://www.microsoft.com/typography/otspec/name.htm


On 08/21/2012 04:00 PM, Peter Constable wrote:
>  
> 
> One issue Bob’s comments raise has to do with the way that platform and
> encoding IDs are used both for name records and cmap subtables. In a cmap
> subtable, the difference between UCS-2 and UTF-16 is really important since
> specific formats would be needed to support UTF-16. In contrast, there’s
> nothing that would necessarily need to be different for name table data
> structures. In fact, I doubt that there’s anywhere in the Windows platform
> where a name table string might get processed that would assume UCS-2 and
> _/not/_ UTF-16.
> 
> Hence, there might not be any problem if the spec were to state that 3/1
> _/or/_ 3/10 name strings are assumed to be encoded as UTF-16; or even further,
> to stipulate that 3/10 should not be used in name records and that 3/1 name
> strings are assumed to be UTF-16.
> 
>  
> 
>  
> 
> Peter
> 
>  
> 
> *From:*mpeg-OTspec at yahoogroups.com [mailto:mpeg-OTspec at yahoogroups.com] *On
> Behalf Of *Levantovsky, Vladimir
> *Sent:* August 8, 2012 8:28 AM
> *To:* bobh528; mpeg-OTspec at yahoogroups.com
> *Subject:* RE: [mpeg-OTspec] Re: New work on 3rd edition of the OFF (AHG
> kick-off) - name table
> 
>  
> 
>  
> 
> Hi Bob,
> 
>  
> 
> Thank you very much for taking the time to review the draft and for your comments.
> 
> Aside from the changes in OS/2 Panose field and new ‘rclt’ feature
> description, all other changes you currently see in the draft are rolled in
> from already issued and approved prior amendments and corrigenda. The text of
> the ‘name’ table description hasn’t been modified at all recently; the last
> changes we made were discussed back in 2009/2010 when the second amendment was
> finalized. I verified that the current text is an exact match of OT 1.6
> (http://www.microsoft.com/typography/otspec/name.htm) – with the exception of
> the example page (http://www.microsoft.com/typography/otspec/namesmp.htm) that
> is nested in the HTML version of OT1.6 and ‘inlined’ in the ISO text.
> 
>  
> 
> I agree with you that there are quite a few places where the current ‘name’
> table text could be improved – in fact, the total re-write of this section was
> already proposed by Josh Hadley earlier this year:
> http://tech.groups.yahoo.com/group/mpeg-OTspec/message/714
> 
> Now may be a good time to discuss it in detail and see if we can improve this
> section of the spec while the editing period is still open (until 8/31/12).
> However, it’s not a “now or never” kind of deal, so I don’t want anyone to feel
> rushed to make changes. The clarity of the spec is what matters, so if it
> takes us a little longer to finalize it, that’s fine (this is what the working
> drafts are for).
> 
>  
> 
> Thank you,
> 
> Vlad
> 
>  
> 
>  
> 
>  
> 
> *From:* mpeg-OTspec at yahoogroups.com [mailto:mpeg-OTspec at yahoogroups.com]
> *On Behalf Of* bobh528
> *Sent:* Tuesday, August 07, 2012 6:04 PM
> *To:* mpeg-OTspec at yahoogroups.com
> *Subject:* [mpeg-OTspec] Re: New work on 3rd edition of the OFF (AHG kick-off)
> - name table
> 
> 
> (sorry -- previous post seems to have gone astray...)
> 
>  
> 
> On 2012-07-27 at 15:06 Levantovsky, Vladimir wrote:
> 
>     I would like to ask you to review the first draft text
> 
> 
> Thanks for getting this process going.
> 
> I have some questions about the spec for the name table.
> 
> 1) In section 5.2.6.3 Name IDs, below the table of name IDs, is a Note in
> which the text:
> 
> All 'name' table strings for platform ID 3 (Windows platform) must be in
> Unicode, using the UTF-16 encoding form.  The character set encoding for 'name'
> table strings with platform ID 0 (Macintosh) is determined by the encoding ID.
> 
> has been replaced with:
> 
> 
> Note that OS/2 and Windows both require that all name strings be defined in
> Unicode. Thus all 'name' table strings for platform ID = 3 (Windows) will
> require two bytes per character. Macintosh fonts require single byte strings.
> 
> 
> This appears to be a regression to the text from MS spec 1.6 -- is that
> intended?  If so, the "two bytes per character" phrase needs to be updated to
> modern language.
> 
> But in either case, a key question is whether SMP characters (coded using
> surrogate pairs) are permitted or not. If they are, then the correct term to
> use is "UTF-16". If they are not, then "UTF-16" is /not/ the correct term -- I
> think the correct term would then be "UCS-2".
> 
> 2) Section 5.2.6.2 /Platform IDs, Platform-specific encoding IDs and
> Language IDs/ currently includes this table:
> 
> 
> *Windows platform-specific encoding IDs (platform ID= 3)*
> 
> Platform ID   Encoding ID   Description
> -----------   -----------   -------------------
> 3             0             Symbol
> 3             1             Unicode BMP (UCS-2)
> 3             2             ShiftJIS
> 3             3             PRC
> 3             4             Big5
> 3             5             Wansung
> 3             6             Johab
> 3             7             Reserved
> 3             8             Reserved
> 3             9             Reserved
> 3             10            Unicode UCS-4
> 
> 
> What does the third column of this table mean? In the context, it seems to be
> saying that if I want a name string with SMP characters in it, then I can use
> 3/10 encoding and encode the string in UCS-4.  Is that what it is really
> saying?  If this is true, then it goes counter to /either/ of the quotes in my
> question 1 above (about UTF-16 or 2-byte characters).
> 
> Bob Hallissy