JLREQ <-> fonts

Fri Oct 7 18:59:34 CEST 2011

In the model of Japanese layout presented in JLREQ [1], a number of 
characters, such as the ideographic punctuation marks and the brackets, 
are described as 1/2 em wide, and the spacing tables add some space 
around them as needed, typically 1/2 em on one side, or 1/4 em on each 
side. 
See for example figure 64 in JLREQ; the blue cells are space which is 
added by the layout engine (and there is no space character in the 
text). Not apparent from the figure: the added space is elastic, and is 
adjusted as needed for the justification of the lines; and there is 
space added on each side of each character - you don't see them in the 
figure because the default width of many of those spaces is 0, but they 
can grow for justification.

Note 1 following that table states:

> In font implementations, punctuation marks can be given a different 
> character width, but it is expected that the font is capable of 
> following the line composition rules explained here to produce the 
> final result. For example, when opening brackets (cl-01) and closing 
> brackets (cl-02) are implemented with full-width size, it is possible 
> that a minus half em space is inserted between adjacent closing 
> brackets (cl-02) and opening brackets (cl-01) (Some implementations 
> prepare minus half em and quarter em spaces). In letterpress printing, 
> it was also common practice to combine punctuation marks with a 
> half-width body and half em spaces in order to make it easier to 
> remove the space later for adjustment. Because of that, the types were 
> picked up except for the punctuation marks at the type-picking phase, 
> following the manuscript, and the punctuation marks were picked only 
> when they were necessary in composing a page. Later, with the 
> increasing adoption of Monotype machines, punctuation marks with a 
> full-width body became popular and both full-width and half-width 
> punctuation marks have been used, mixed together, since then.
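
The bracket example in the note can be checked with a little 
arithmetic. The following is only a sketch: it assumes that the JLREQ 
model puts a 1/2 em space between a closing bracket and a following 
opening bracket, and that the full-width implementation inserts the 
"minus half em" space the note mentions. The variable names are mine.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

# JLREQ model: two half-width brackets, plus the space added by the
# layout engine between them (assumed here to be 1/2 em).
jlreq_width = HALF + HALF + HALF

# Full-width implementation: two 1em brackets, plus the "minus half em"
# space inserted between them.
full_width_result = 1 + 1 - HALF

print(jlreq_width, full_width_result)
```

Both approaches end up with the same 1.5 em for the pair; they differ 
only in whether the layout engine adds space or takes it away.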

It is indeed the practice in OpenType fonts to have the advance width of 
those glyphs at 1em and to place the ink accordingly. For example, the 
ink of a left parenthesis is in the right half of the em, and the left 
half is empty; the ink of a right parenthesis is in the left half of the 
em; the ink in a middle dot is in the center half em, with the left and 
right quarter em empty.
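
To make the convention concrete, here is a small sketch that records 
the three examples above as (blank on the left, ink, blank on the 
right), in em units, for horizontal setting. The table name and the 
helper are hypothetical, and the values are the idealized ones from the 
description, not measurements from any particular font.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
QUARTER = Fraction(1, 4)
ZERO = Fraction(0)

# Hypothetical table: (blank on the left, ink width, blank on the right),
# in em units, for glyphs with a 1em advance width.
FULL_WIDTH_LAYOUT = {
    "\uFF08": (HALF, HALF, ZERO),        # fullwidth left parenthesis
    "\uFF09": (ZERO, HALF, HALF),        # fullwidth right parenthesis
    "\u30FB": (QUARTER, HALF, QUARTER),  # katakana middle dot
}

def advance(ch):
    """Advance width implied by the table, in em units."""
    left, ink, right = FULL_WIDTH_LAYOUT[ch]
    return left + ink + right

for ch in FULL_WIDTH_LAYOUT:
    print(f"U+{ord(ch):04X} advance = {advance(ch)} em")
```

In each case the blank part of the em is exactly the space that the 
JLREQ spacing tables would typically add around a 1/2 em character.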

I suspect that, in addition to the letterpress heritage, this convention 
allowed "western" layout engines to produce acceptable results. 
Essentially, the most typical space that is normally added by a 
JLREQ-aware engine is built into the glyph, and a non-JLREQ-aware engine 
will therefore produce an acceptable layout most of the time (in 
unjustified lines, at least).

The consequence for a JLREQ-aware layout engine is that it must 
fundamentally "remove" that built-in space. This requires a table in the 
layout engine that records what to remove (how much, on which side) from 
which character. While such a table is often adequate, there are some 
problems:

- first, this convention is not recorded anywhere in the OT specs, yet 
it is clearly crucial for the interoperability of layout engines and fonts

- proportional and non-square fonts present a challenge: should one 
remove 1/2 em or 1/2 the advance width of the glyph?

- the situation is a bit different for Chinese, where the ideographic 
comma and period are centered in the em, instead of being on the left 
half; yet, there is no reliable way for a layout engine to know if it 
deals with a Japanese font or with a Chinese font (not to mention 
pan-CJK fonts)
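
As an illustration of such a table and of the second problem, here is a 
minimal sketch. Everything in it is hypothetical: the table contents, 
the function name, the `relative_to` switch, and the 1000 units per em 
are my assumptions, not something taken from an existing engine or from 
the OT specs.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
QUARTER = Fraction(1, 4)

# Hypothetical engine-side table: fraction to remove on (left, right).
# The comma and period entries assume the Japanese convention (ink in
# the left half of the em).
TRIM = {
    "\uFF08": (HALF, 0),           # fullwidth left parenthesis
    "\uFF09": (0, HALF),           # fullwidth right parenthesis
    "\u30FB": (QUARTER, QUARTER),  # katakana middle dot
    "\u3001": (0, HALF),           # ideographic comma
    "\u3002": (0, HALF),           # ideographic full stop
}

def trimmed_advance(ch, advance, em=1000, relative_to="em"):
    """Advance (in font units) after removing the built-in space.

    relative_to="em" removes a fraction of the em;
    relative_to="advance" removes a fraction of the glyph's own advance.
    """
    left, right = TRIM.get(ch, (0, 0))
    base = em if relative_to == "em" else advance
    return advance - (left + right) * base

# Square font: both readings agree.
print(trimmed_advance("\uFF08", 1000, relative_to="em"))
print(trimmed_advance("\uFF08", 1000, relative_to="advance"))

# Glyph with an 800-unit advance: the two readings disagree.
print(trimmed_advance("\uFF08", 800, relative_to="em"))
print(trimmed_advance("\uFF08", 800, relative_to="advance"))
```

With a 1000-unit advance both readings leave 500 units; with an 
800-unit advance one leaves 300 and the other 400, and nothing in the 
font tells the engine which is intended. The comma and period entries 
would likewise be wrong for a font that centers those glyphs in the em.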

It seems to me that we would have a much more reliable system if we used 
an OT feature that would fundamentally deliver glyphs in the JLREQ 
model. I understand that this also calls for a Chinese equivalent of 
JLREQ; it's unclear to me whether we want or need a separate feature for 
Chinese. And then there is the question of Hangul as well.

Discussion?

Eric

[1] JLREQ: Requirements for Japanese Text Layout, W3C Working Group 
Note, http://www.w3.org/TR/2009/NOTE-jlreq-20090604/