[mpeg-OTspec] Feedback on CFR

Ken Lunde lunde at adobe.com
Tue Jan 25 04:07:10 CET 2011


Tony,

I am back at work, and can now address Doug's comments that you deferred to me:

>> - I don't see the need for the ToUnicode element's fromEncoding attribute. Composite fonts should not need to support components that require custom encoding behavior.
> 
> I would like Ken Lunde from Adobe to address this.

There is definitely a need for the ToUnicode element's fromEncoding attribute, specifically when the component font is not Unicode-encoded. The CFR specification intentionally does not state that the component fonts need to be OpenType, meaning that legacy fonts can be specified. Of course, it is up to the consumer to be able to handle such fonts. When a component font does not include a Unicode 'cmap' subtable, transcoding must be used to make it usable in a CFR object, meaning that character codes in the legacy encoding specified by the ToUnicode element's fromEncoding attribute must be converted into Unicode.
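
As a rough illustration of the transcoding step described above, here is a minimal Python sketch. It is not taken from the CFR specification: the function name to_unicode is hypothetical, and using a Python codec name (here "shift_jis") to stand in for a fromEncoding value is an assumption made only for the example.

```python
def to_unicode(code: bytes, from_encoding: str) -> str:
    """Convert one legacy-encoded character code into a Unicode string.

    from_encoding is a Python codec name; treating it as the value of a
    fromEncoding attribute is an assumption made for this sketch.
    """
    return code.decode(from_encoding)


# Shift-JIS 0x889F is the first kanji of JIS X 0208, U+4E9C.
legacy_code = b"\x88\x9f"
unicode_char = to_unicode(legacy_code, "shift_jis")
print(f"U+{ord(unicode_char):04X}")
```

A consumer that cannot decode the named legacy encoding would simply be unable to use that component font.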

>> - ToUnicode maps from 'sequences' of codepoints to 'sequences' of code points, yet the term 'sequence' is a bit ambiguous here. Are these sequences, or ordered sets? If sequences, then if UnicodeSet notation is used, then are these patterns and not strict sequences?  And if ordered sets, do the from and to values need to be the same length? If not, what happens if they are not? (One can imagine, for instance, using this to map multiple codepoints onto a single replacement codepoint, but it's not clear if that use is intended.)
> 
> Again, I would like Ken to respond to this. My understanding is this is used for CID fonts.

It is probably better to state that ToUnicode "can" map to/from sequences when both sequences are contiguous. Singleton mappings can also be specified, even if both sequences are contiguous.
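
One way to read this, sketched in Python: a range mapping pairs two contiguous runs of equal length, code by code, while a singleton maps one code to one code point. This is an interpretation for illustration only, not text from the CFR specification; expand_range and the code values in the table are hypothetical.

```python
def expand_range(from_start: int, to_start: int, length: int) -> dict[int, int]:
    """Pair two contiguous runs of equal length, code by code."""
    return {from_start + i: to_start + i for i in range(length)}


# Hypothetical table: one contiguous range plus one singleton mapping.
table: dict[int, int] = {}
table.update(expand_range(0x21, 0xFF01, 3))  # 0x21..0x23 -> U+FF01..U+FF03
table[0x7E] = 0x203E                         # singleton: 0x7E -> U+203E

for legacy, uni in sorted(table.items()):
    print(f"0x{legacy:02X} -> U+{uni:04X}")
```

Under this reading the from and to runs always have the same length, so the question of mismatched lengths does not arise for range mappings.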

Regards...

-- Ken

On Jan 18, 2011, at 4:10 PM, Tony Tseung wrote:

> Vladimir,
> 
> Thanks for posting the feedback on CFR. My response to the feedback from Doug Felt is below.
> 
> Tony
> __________
> 
> Dear Doug,
> 
> Long time no see (since Taligent). Thank you for the feedback.
> 
> On Jan 17, 2011, at 9:21 PM, Levantovsky, Vladimir wrote:
> 
>> Feedback from Doug Felt:
>> 
>> - 'prefered' is not the preferred spelling of 'preferred'
> Agreed. I will correct the spelling.
> 
>> - The name and metrics fields are essentially taken from parts of the head, OS/2, hhea, and vhea tables in the OpenType spec. This spec must be referenced.
> For some, we actually did, but since this is a new spec without the preexisting notion of sfnt TrueType, we decided not to. We could reference the TrueType or OpenType spec as appropriate in the references section of the spec.
> 
>> - Language is defined as using ISO 639 codes, this should use BCP47 codes. In particular, it should not be restricted to two letters, and should accept script tags.
> 
> Agreed, Mark Davis had also raised this point. It's well taken.
> 
>> - LanguagePreferedList (sic) is described as containing two or more LanguagePreferredComponentDef instances, this should be one or more.
> 
> Agreed; we will update the spec.
> 
>> - LanguagePreferredComponentDef should have language as a required attribute, not as an element.
> 
> True, but an element is about the same as a required attribute either way; only the parsing code differs. Since we had a precedent for this, I try not to change anything except truly wrong usage.
> 
>> - It is not clear to me why ComponentDef needs to allow more than one UnicodeCharSet element.
> 
> Mainly for convenience. In the end, the characters are concatenated to form a single character set for the component, and there is no order within a component.
> 
>> - I don't see the need for the ToUnicode element's fromEncoding attribute. Composite fonts should not need to support components that require custom encoding behavior.
> 
> I would like Ken Lunde from Adobe to address this.
> 
>> - The examples section is not normative yet that is the only place where the languagepreferredcomponentdef usage is described. There must be a normative section detailing how language and unicode character sequence are used together with the font spec to select a font and glyph id(s).
> 
> Agreed, we should add explanation of the expected behaviour in the spec.
> 
>> - The phrase 'unicode code point sequence' is used in conjunction with UnicodeSet. I'm assuming these are sequences of characters or UnicodeSet expressions in a single string. Since UnicodeSets are delimited by '[' and ']', is there a way to specify these characters without defining a UnicodeSet? Or must they be defined within a UnicodeSet?
> 
> We meant UnicodeSet. Is there an established term for this?
> 
>> - UnicodeSet allows strings as elements. Are these allowed?
> 
> Yes, perhaps I should recommend the use of ICU to parse these.
> 
>> - UnicodeSet allows the use of unicode properties, which in turn depend on the version of Unicode. Are these allowed? Does the platform define the version of Unicode used?
> 
> Yes, yes. How do you specify the Unicode version in strings (texts)?
> 
>> - ToUnicode maps from 'sequences' of codepoints to 'sequences' of code points, yet the term 'sequence' is a bit ambiguous here. Are these sequences, or ordered sets? If sequences, then if UnicodeSet notation is used, then are these patterns and not strict sequences?  And if ordered sets, do the from and to values need to be the same length? If not, what happens if they are not? (One can imagine, for instance, using this to map multiple codepoints onto a single replacement codepoint, but it's not clear if that use is intended.)
> 
> Again, I would like Ken to respond to this. My understanding is this is used for CID fonts.
> 
>> - I'm assuming the cmap is definitive as to what characters are presumed supported by the font. So for example if the actual composite font does not support the character in its cmap (even though specified in the ComponentDef) this font is still resolved as far as font lookup through the composite is concerned (and the result will be that component's missing glyph). An explicit statement would be useful.
> 
> Your assumption is correct. Mark also requested some explanation. We will add this to the spec soon.
> 
>> - In general it's desirable to choose glyphs from the same font whenever possible. Composite fonts can get in the way of this process.  For example, combining marks (U+0300 block) should come from the font the base character comes from, and not be overly specified by the composite character mapping table. Enclosing punctuation is often best obtained from the font that the surrounding characters come from as well, and of course paired punctuation should always come from the same font. It's not clear how these issues should be dealt with by people employing composite fonts.
> 
> Agreed in principle, but this is a simple mechanism to enable the rendering of a character that would normally be rendered as the '.notdef' glyph (0) in a font. The idea is to find a (component) font that has something that can be used in place of the character. It is now up to the authors of the composite font to carefully construct a schema that selects the most appealing fallback font, taking these typographic features into consideration. We considered cohesive 'feat' and 'mort' actions and then dropped them from the feature set, as they would pollute the simplicity of the mechanism.
> 
> Regards
> 
> Tony
> 
> 



