[MPEG-OTSPEC] Defining the text shaping working group’s scope

梁海 Liang Hai lianghai at gmail.com
Tue Aug 4 07:26:18 CEST 2020


Suzuki,

> Now I understand that the "text layout" is the legacy term assuming the simple scripts like (some) European and CJK, so its assumed coverage is the smaller than "text shaping".


Let me clarify my judgment about “text layout”:

I consider it to be a kinda legacy term *when* it’s used to refer to both the “layout” proper and what informed experts prefer to call “shaping” today. The “layout” proper is a valid concept and we still call it “layout”; we also call the software handling it “layout engines”.

“Text shaping” is better treated as a level of abstraction separate from the traditional understanding of “text layout”: once you realize how complicated it is to shape complex scripts (ie, the scripts that are encoded in a graphically indirect way), it no longer makes sense to consider shaping a trivial part of the whole text layout concept.

The technical stack from encoding to display can be structured in various ways, and it’s often not actually linear, but one simple visualization may look like this:

rasterization
layout
shaping
metadata
encoding

For European scripts and CJK, the shaping level is typically just too thin to stand out. It’s like, for people who work in command line interfaces with bitmap fonts, basically the whole stack is too thin to even think about. It’s probably alright for an expert to be unfamiliar with the concept of “shaping”, as long as we collectively as the industry are well informed. Text shaping is critical to many encoded scripts because it’s where we pay the graphically abstract encoding’s debt for those scripts.

> Indeed. The coverage of the term is not the highest priority to me, at present. The coverage of the tasks is the highest priority. What I want to know was whether the issue raised by W3C is covered by the text shaping WG's scope. The answer is "covered" - right?

The so-called “text shaping working group” we’re discussing now is a collective effort just initiated by a short kickoff meeting. The goal is to resolve various issues, and the exact scope is gonna be driven by need, participants’ interests, and practicality.

It’s not a formal ISO ad hoc group yet, and no one has a predefined scope for it either. You should feel free to propose whatever makes sense to you. Just try to get on this train if you think it’s the answer, and just build another train if you get kicked out.

Best,
梁海 Liang Hai
https://lianghai.github.io

> On Aug 4, 2020, at 12:03, suzuki toshiya <mpsuzuki at hiroshima-u.ac.jp> wrote:
> 
> Dear 梁海,
> 
>> “Text layout” is a kinda legacy term that originated from European and CJK experts’ notion that there’s nothing much beyond cmap for inline shaping and thus the whole display process of digital texts can be summarized as “layout”, from lines to paragraphs.
> 
> Oh! I found that I was too lazy to catch up the latest terminology. Now I understand that the "text layout" is the legacy term assuming the simple scripts like (some) European and CJK, so its assumed coverage is the smaller than "text shaping". It's my big misunderstanding. Thank you very much for clarification.
> 
>> Instead of trying to define coverage of terms, it’s much more helpful to just talk about issues. Whoever interested in an issue should take it up.
> 
> Indeed. The coverage of the term is not the highest priority to me, at present. The coverage of the tasks is the highest priority. What I want to know was whether the issue raised by W3C is covered by the text shaping WG's scope. The answer is "covered" - right?
> 
> Regards,
> mpsuzuki
> 
> On 2020/08/04 12:27, 梁海 Liang Hai wrote:
>> The de facto meaning of “text shaping” is basically giving digital texts (ie, text strings, typically encoded in Unicode) a visual form (shape), with whatever relevant additional information (language tagging, OTL feature switches…). It’s more about inline and plain text display. A typical example of text shaping is what the OpenType technology (cmap + OTL + …) and HarfBuzz does.
>> “Text layout” is a kinda legacy term that originated from European and CJK experts’ notion that there’s nothing much beyond cmap for inline shaping and thus the whole display process of digital texts can be summarized as “layout”, from lines to paragraphs. People dealing with complex scripts these days, however, tend to prefer the term “text shaping” when specifically referring to those inline transformation operations because they’re very complicated already and distinct from how rich text formats and lines and paragraphs are composed together.
>> Vertical layout (not only Japanese, and not only CJK) is sitting on the vague boundary. It’s always been considered more a matter of line composition; however, it’s becoming more and more clear that it’s more complicated than that. For example, the boundaries between rotated and upright runs in vertical lines create a lot of problems for correct text shaping.
>> Instead of trying to define coverage of terms, it’s much more helpful to just talk about issues. Whoever interested in an issue should take it up.
>> Best,
>> 梁海 Liang Hai
>> https://lianghai.github.io/
>>> On Aug 4, 2020, at 09:51, suzuki toshiya <mpsuzuki at hiroshima-u.ac.jp> wrote:
>>> 
>>> Dear AHG Convenor,
>>> 
>>> Maybe it's too late to ask such question, but please let me ask about the coverage of "Text Shaping". The first web page suggested by Google for "Text Shaping" is the document of harfbuzz library. I understand "Text Shaping" does the selection or extraction of the graphic data for an appropriate glyph, or grapheme, or ligature, or cluster by a specified single font instance, from the string of the coded character set. It does not assume the input is marked-up text like HTML.
>>> 
>>> I guess, "Text Shaping" is a part of "Text Layout", but some of "Text Layout" might be out of the scope of "Text Shaping". For example, the vertical layout of Japanese text, like,
>>> 
>>> On 2020/03/26 4:56, 'Levantovsky, Vladimir' vladimir.levantovsky at monotype.com [mpeg-OTspec] wrote:
>>>> 3.       Compatibility problem with ‘vert’
>>>> The discussion linked from https://lists.w3.org/Archives/Public/www-archive/2019Dec/ seem to be specific to a particular behavior of existing version of Adobe InDesign application and Adobe Japanese font set. Considering the recent changes introduced by ISO/IEC 14496-22:2019/AMD1, I am not sure how much more (if anything) need to be done on the spec side. I do realize that both applications and fonts need to be updated to be compliant with new feature descriptions.
>>> 
>>> is covered by "Text Shaping" ? I believe, "Text Layout" covers it, but I'm not sure whether "Text Shaping" covers it.
>>> 
>>> Regards,
>>> mpsuzuki
