<HTML>
<HEAD>
<TITLE>Re: [OpenType] Reverse chaining contextual lookup</TITLE>
</HEAD>
<BODY>
<FONT FACE="Verdana, Helvetica, Arial"><SPAN STYLE='font-size:12.0px'>Thank you, Sergey, for the additional perspective. <BR>
<BR>
I would like to explain how I conceive of this newly proposed reverse contextual-chaining mechanism. To do so most clearly, I’d like to start by summarizing in a simple manner the behavior of ordinary (forward-moving) contextual chaining. Let’s visualize a typical text run as a row of seats in a theater. Suppose we have a row of 10 seats, A1 through A10. Facing the row, we see A1 at one end and A10 at the other. We are searching for the pattern A4-A5-A6 in the run. Beginning at A1, we step one glyph (one seat) at a time until we come to A6 and realize we have a match for the pattern. <BR>
<BR>
[start] A1 A2 A3 A4 A5 A6 A7 A8 A9 A10 [end]<BR>
<BR>
For reverse contextual chaining, we are facing the same row of seats in the same way. A1 is still at one end and A10 at the other. Pattern A4-A5-A6 is in the same relative position in the run. Here’s how the forward and reverse versions differ: we begin scanning the run at A10, counting down one position at a time towards A1. When we reach A4, we see that we have matched the pattern A4-A5-A6. So, pattern matching proceeds in the same direction as in a forward-moving contextual lookup; only the scanning moves in the opposite direction. Once we’ve matched the pattern, we can choose what to do. If we want to perform a many-to-one (ligature) substitution, we would invoke the appropriate lookup with an index pointing to the start of the pattern, A4. If, for instance, we wished to make a change starting at the middle glyph of the pattern, we would point to A5 instead. <BR>
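<BR>
The two scanning directions can be sketched in Python roughly as follows. This is only an illustration: the function names and the list-of-names model of a text run are my own assumptions, not part of the OpenType spec or of any layout engine. For simplicity the sketch reports a match at the first glyph of the pattern in both directions; the pattern itself is always compared from its first glyph to its last, and only the order in which positions are visited changes.<BR>
<PRE>
```python
# Hypothetical model: a run is a list of glyph names, a pattern is a shorter list.
def matches_at(run, pattern, start):
    """True if pattern occurs in run beginning at index start (compared left to right)."""
    return run[start:start + len(pattern)] == pattern

def forward_scan(run, pattern):
    """Visit candidate positions from the first glyph toward the last."""
    visited = []
    for start in range(len(run)):
        visited.append(run[start])
        if matches_at(run, pattern, start):
            return start, visited
    return None, visited

def reverse_scan(run, pattern):
    """Visit candidate positions from the last glyph toward the first."""
    visited = []
    for start in range(len(run) - 1, -1, -1):
        visited.append(run[start])
        if matches_at(run, pattern, start):
            return start, visited
    return None, visited

run = ["A" + str(n) for n in range(1, 11)]   # A1 .. A10
pattern = ["A4", "A5", "A6"]

fwd_index, fwd_visited = forward_scan(run, pattern)
rev_index, rev_visited = reverse_scan(run, pattern)
print(run[fwd_index], fwd_visited)   # match anchored at A4, reached from A1
print(run[rev_index], rev_visited)   # match anchored at A4, reached from A10
```
</PRE>
Both scans report the match at the same glyph (A4); they differ only in which glyphs were visited on the way there.<BR>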
<BR>
What advantages does reverse contextual chaining give us? When we use forward-looking contextual lookup, we do not know how long any particular text run is. To “reach” the end of the run, we have to enumerate arbitrarily long patterns. In the above example, we would need to list a 10-glyph pattern to match the text run and identify its end. Had our longest pattern consisted of 9 glyphs only, we would not have matched the run.<BR>
<BR>
On the other hand, with a reverse contextual lookup, we know we are starting at the last glyph (A10) of the text run regardless of length, and will move backwards towards the start glyph (A1) in appropriate ‘chunks’. Once we are positioned at the last glyph, we can examine the run against expected patterns of the appropriate length. The following could be a typical traversal of the text run from end to start:<BR>
</SPAN></FONT><SPAN STYLE='font-size:12.0px'><FONT FACE="Courier New"><B><BR>
A8 A9 A10<BR>
A5 A6 A7 A8<BR>
A3 A4 A5<BR>
A1 A2 A3<BR>
</B></FONT><FONT FACE="Verdana, Helvetica, Arial"><BR>
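The traversal listed above can be sketched in Python as follows. The helper name and the idea of passing the chunk lengths in explicitly are assumptions made for illustration; in a real font the lengths would come from whichever pattern matches at each step. Each chunk ends on the glyph where the previous chunk began, which is what the listing shows (A8, A5 and A3 each appear twice).<BR>
<PRE>
```python
# Hypothetical helper: walk a run from its last glyph toward its first,
# taking chunks of the given lengths. Each chunk ends on the glyph where the
# previous chunk began, mirroring the overlapping listing above.
def reverse_chunks(run, lengths):
    chunks = []
    end = len(run) - 1                     # last glyph, whatever the run length
    for length in lengths:
        start = max(end - length + 1, 0)   # never step past the first glyph
        chunks.append(run[start:end + 1])
        if start == 0:                     # reached the start of the run
            break
        end = start
    return chunks

run = ["A" + str(n) for n in range(1, 11)]   # A1 .. A10
chunks = reverse_chunks(run, [3, 4, 3, 3])
for chunk in chunks:
    print(" ".join(chunk))
```
</PRE>
Note that the starting point is found from the length of the run itself, so no pattern has to span the whole run in order to locate its end.<BR>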
<BR>
Kamal<BR>
<BR>
<BR>
On 2011.2.9 13:46, "Sergey Malkin" &lt;sergeym@microsoft.com&gt; wrote:<BR>
<BR>
</FONT></SPAN><BLOCKQUOTE><SPAN STYLE='font-size:12.0px'><FONT FACE="Verdana, Helvetica, Arial">Message from OpenType list:<BR>
<BR>
<BR>
When we designed the reverse chaining lookup format, the problem was in clearly defining the desired behavior. In normal lookups, we always know that the layout engine is going through the characters in order, and we know which character a lookup is applied to. With reverse, this was not that clear. Consider the following ligature substitution (a contextual lookup is not really needed for my example):<BR>
<BR>
AAB -> M<BR>
AB -> N<BR>
<BR>
And let's say the input is AAB.<BR>
<BR>
If we are moving forward, we will try lookups at the first A, then at the second A, and then at B. It is clear that the AAB sequence will be matched and substituted.<BR>
<BR>
If we are moving backwards, the engine would check for sequences starting at B, then at the second A, then at the first A:<BR>
<BR>
- AB will be matched if we prefer the sequence that starts closer to the end of the input, where the backward iteration begins.<BR>
- AAB will be matched if we say both sequences start at B and extend towards the beginning of the input string.<BR>
<BR>
We saw neither why one should be preferred over the other, nor how the choice could be described clearly enough in the spec to be unambiguously defined. So we decided to restrict the lookup to the case of a single glyph, which is unambiguous. This lookup had a single clear scenario, supporting the Urdu script, and the format we ended up with was perfectly sufficient for it.<BR>
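<BR>
The two backward readings described above can be sketched in Python as follows. The rule table, the function names, and the tie-break (rules are tried in listed order) are assumptions for illustration only; they are not taken from the OpenType spec or from any layout engine.<BR>
<PRE>
```python
# Ligature rules from the example, in listed order: AAB gives M, AB gives N.
RULES = [("AAB", "M"), ("AB", "N")]

def forward(text):
    """Move forward; at each position try every rule that starts there."""
    i = 0
    while i != len(text):
        for seq, lig in RULES:
            if text.startswith(seq, i):
                text = text[:i] + lig + text[i + len(seq):]
                break
        i += 1
    return text

def backward_by_start(text):
    """Move backward; at each position try every rule that STARTS there."""
    for i in range(len(text) - 1, -1, -1):
        for seq, lig in RULES:
            if text.startswith(seq, i):
                text = text[:i] + lig + text[i + len(seq):]
                break
    return text

def backward_by_end(text):
    """Move backward; at each position try every rule that ENDS there."""
    i = len(text) - 1
    while i != -1:
        prefix = text[:i + 1]
        for seq, lig in RULES:
            if prefix.endswith(seq):
                text = prefix[:-len(seq)] + lig + text[i + 1:]
                i = i - len(seq) + 1       # now pointing at the new ligature
                break
        i -= 1
    return text

print(forward("AAB"))             # M
print(backward_by_start("AAB"))   # AN
print(backward_by_end("AAB"))     # M
```
</PRE>
The same input and the same rules give different results depending on which backward reading is chosen, which is the ambiguity in question.<BR>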
<BR>
This is my recollection from 2001, when we defined reverse chaining lookup. So I may be wrong in some details, but this summarizes our thinking at that time.<BR>
<BR>
Thanks,<BR>
Sergey<BR>
<BR>
-----Original Message-----<BR>
From: listmaster@indx.co.uk [<a href="mailto:listmaster@indx.co.uk">mailto:listmaster@indx.co.uk</a>] On Behalf Of Mansour, Kamal<BR>
Sent: Tuesday, February 08, 2011 2:00 PM<BR>
To: multiple recipients of OpenType<BR>
Subject: [OpenType] Reverse chaining contextual lookup<BR>
<BR>
>****** Attachments to this email message have been removed ******<BR>
<BR>
Use of Reverse Chaining<BR>
<BR>
The current definition of reverse-chaining single substitution recognizes that for some situations it is best to search backwards for a pattern in a run of text. This type of lookup was introduced specifically to cope with the complexities of the Nastaliq style of Arabic script. It turns out that a full implementation of other styles of Arabic writing, including Naskh, could also benefit from such a scan of a run of text.<BR>
<BR>
Upon matching a pattern, the current definition of the reverse-chaining lookup allows only a simple one-to-one substitution to take place. The following extract is from the OpenType spec:<BR>
<BR>
Reverse Chaining contextual single substitution, allows one glyph to be substituted with another by<BR>
chaining input glyph to a 'backtrack' and/or 'lookahead' sequence. The difference between this and<BR>
other lookup types is that processing of input glyph sequence goes from end to start.<BR>
<BR>
In many contexts, a simple substitution may not be enough to carry out the necessary changes. Moreover, the normal (i.e., forward-scanning) chaining contextual lookup permits a broader choice of actions when a pattern is matched, including the direct invocation of another lookup.<BR>
<BR>
I propose the addition of a new reverse-chaining, contextual-substitution lookup type which is identical in functionality to the ordinary contextual-substitution lookup except for directionality.<BR>
<BR>
I look forward to your comments.<BR>
<BR>
Kamal Mansour<BR>
Monotype Imaging<BR>
<BR>
List archive: <a href="http://www.indx.co.uk/biglistarchive/">http://www.indx.co.uk/biglistarchive/</a><BR>
<BR>
subscribe: opentype-migration-sub@indx.co.uk<BR>
unsubscribe: opentype-migration-unsub@indx.co.uk<BR>
messages: opentype-migration-list@indx.co.uk<BR>
</FONT></SPAN></BLOCKQUOTE><SPAN STYLE='font-size:12.0px'><FONT FACE="Verdana, Helvetica, Arial"><BR>
</FONT></SPAN>
</BODY>
</HTML>