From wjgo_10009 at btinternet.com Fri Jun 7 13:11:05 2024
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Fri, 7 Jun 2024 12:11:05 +0100 (BST)
Subject: [MPEG-OTSPEC] Does a rendering system know if a variation selector requested glyph is not available in a font?
Message-ID: <5bb4279a.ce63.18ff26522e7.Webtop.126@btinternet.com>

There is a document

(R)Unicode: Encoding and Sustainability Issues in Runology

https://www.unicode.org/L2/L2024/24129-runology.pdf

I have no expertise at all in Runology but I did notice one thing in the document that has prompted me to make an observation.

On page 14 the document has the following.

> Fonts supporting that form would display the appropriate glyph,
> recognising the string of base-character + variation selector, but
> fonts without such support would ignore the variation selector and
> fall back to a generic form for the base character.

I suggest that it would be possible to have a version of the displaying program such that, if the fallback glyph is displayed because the requested glyph is not available in the font, then that fallback glyph could be displayed in, say, red, or marked in some other way, so that it would be clear that this was the situation rather than it being just an unreported fallback. Is that possible with fonts and rendering systems as they are now?

So it would not be a situation of the variation selector request being silently ignored, but a situation of the request not being acted upon and the researcher being notified of that.

I appreciate that this would not be a feature requiring encoding in Unicode, but would involve the font and the rendering system. So how could this be implemented, please? Indeed, are there any programs that already implement such a feature? If the font returns the default glyph when asked for the glyph requested by a variation sequence, does the rendering system know that this is the case? If so, how? If not, can a feature be added to the font specification so that the rendering system will know, please?

I am thinking that such a feature could be useful in various situations, not only runology.

William Overington

Friday 7 June 2024

From john at tiro.ca Fri Jun 7 17:28:47 2024
From: john at tiro.ca (John Hudson)
Date: Fri, 7 Jun 2024 08:28:47 -0700
Subject: [MPEG-OTSPEC] Does a rendering system know if a variation selector requested glyph is not available in a font?
In-Reply-To: <5bb4279a.ce63.18ff26522e7.Webtop.126@btinternet.com>
References: <5bb4279a.ce63.18ff26522e7.Webtop.126@btinternet.com>
Message-ID:

The font wouldn't need to do anything special in this scenario. The behaviour you describe is the sort of thing that is usually found at the application level, which is where text highlighting/colouring takes place, in concert with the shaping engine. It seems easy enough to do: check for presence of variation selector sequences, check for format 14 cmap mappings for those sequences in the current font, highlight any sequences without mappings.

[In the context of the runology queries, I don't think variation selectors are the best solution. There are similar issues in the study of historical texts in many writing systems, and the set of variants is usually too large and too open-ended to be suitable for Unicode's strict definition of variation selector forms. The current version of the Brill Epichoric font, for instance, includes 31 forms of Alpha (not including RTL boustrophedon variants and stoichedon spacing forms), and it only takes discovery and publication of one more inscription to introduce one or more additional variants.]
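The three steps John lists (find variation sequences, look them up, highlight the unmapped ones) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not code from any existing application: `SUPPORTED` is a hypothetical stand-in for the set of (base, selector) pairs that a font's format 14 cmap subtable maps, the runic sequences are invented for the example, and a real application would read the set from the font and cooperate with its shaping engine.

```python
def is_variation_selector(cp: int) -> bool:
    """True for VS1-VS16 (U+FE00..U+FE0F) and VS17-VS256 (U+E0100..U+E01EF)."""
    return 0xFE00 <= cp <= 0xFE0F or 0xE0100 <= cp <= 0xE01EF


def unmapped_sequences(text, supported):
    """Return (index_of_base, base, selector) for every variation sequence
    in `text` whose (base, selector) pair is absent from `supported`."""
    missing = []
    prev = None  # code point of the preceding non-selector character, if any
    for i, ch in enumerate(text):
        cp = ord(ch)
        if is_variation_selector(cp):
            if prev is not None and (prev, cp) not in supported:
                missing.append((i - 1, prev, cp))
            prev = None  # a selector never serves as the base of a sequence
        else:
            prev = cp
    return missing


# Hypothetical font data: only U+16A0 followed by VS1 has a mapping.
SUPPORTED = {(0x16A0, 0xFE00)}

sample = "\u16A0\uFE00\u16A2\uFE00"
for index, base, selector in unmapped_sequences(sample, SUPPORTED):
    # An application would colour this span instead of printing.
    print(f"highlight U+{base:04X} at index {index}: no mapping for U+{selector:04X}")
```

Keeping the font lookup behind a plain set keeps the text-scanning logic independent of whichever font library supplies the mappings.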
JH

--
John Hudson
Tiro Typeworks Ltd
www.tiro.com

Tiro Typeworks is physically located on islands in the Salish Sea, on the traditional territory of the Snuneymuxw and Penelakut First Nations.
__________

EMAIL HOUR
In the interests of productivity, I am only dealing with email towards the end of the day, typically between 4PM and 5PM. If you need to contact me more urgently, please use other means.

_______________________________________________
mpeg-otspec mailing list
mpeg-otspec at lists.aau.at
https://lists.aau.at/mailman/listinfo/mpeg-otspec

From wjgo_10009 at btinternet.com Fri Jun 7 21:46:14 2024
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Fri, 7 Jun 2024 20:46:14 +0100 (BST)
Subject: [MPEG-OTSPEC] Does a rendering system know if a variation selector requested glyph is not available in a font?
In-Reply-To: <5bb4279a.ce63.18ff26522e7.Webtop.126@btinternet.com>
References: <5bb4279a.ce63.18ff26522e7.Webtop.126@btinternet.com>
Message-ID: <78e21982.d93f.18ff43cc59f.Webtop.126@btinternet.com>

Thank you.

Best regards,

William

From htl10 at users.sourceforge.net Wed Jun 19 22:48:13 2024
From: htl10 at users.sourceforge.net (Hin-Tak Leung)
Date: Wed, 19 Jun 2024 20:48:13 +0000 (UTC)
Subject: [MPEG-OTSPEC] Does a rendering system know if a variation selector requested glyph is not available in a font?
In-Reply-To:
References: <5bb4279a.ce63.18ff26522e7.Webtop.126@btinternet.com>
Message-ID: <505114297.12929920.1718830093799@mail.yahoo.com>

It just so happened that somebody (likely one of the people lurking on this list, or a party closely associated with one) requested an addition to freetype-py, the Python binding to FreeType, towards such a goal.
Basically, the requester asked that it should be possible via freetype-py to retrieve the character + variant selector -> glyph ID mapping. In the submitted example, this is one such result with Source Han Sans: the first number is the glyph ID of the character alone (without a variant selector), followed by the glyph IDs obtained by appending each of 15 different variant selectors:

40593|62913|62914|62915|62916|62917|62918|62919|62920|62921|62922|62923|62924|62925|62926|40593

Note that the last one is identical to the one without a selector. An equivalent answer can be retrieved via HarfBuzz and fontTools. So it is relatively simple to check whether character + variant selector looks up to the same ID as the character alone, and to highlight that the mapping for that selector is missing. See #195/#196 on freetype-py's GitHub tracker for those who want to know.

My role in this was mainly just to approve the suggested code addition and examples, so I am not familiar with this area on the "context/history" side. I think it would be nice to have names/annotations for the variant selectors, e.g. "region" or "historical forms", but that's unlikely to be the case? John's comment about more additional variants seems to corroborate that: the numbering/addition of variant selectors is a bit ad hoc?

From wjgo_10009 at btinternet.com Sat Jun 22 00:41:25 2024
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Fri, 21 Jun 2024 23:41:25 +0100 (BST)
Subject: [MPEG-OTSPEC] Does a rendering system know if a variation selector requested glyph is not available in a font?
In-Reply-To:
References:
Message-ID: <3115cd38.1694.1903cf62ea3.Webtop.159@btinternet.com>

Thank you.

Interestingly, I noticed today on the Unicode Current Documents Register web page the following document title, though the document is not posted at the time of writing this post.

Proposal to Encode a Set of 128 User-Defined Variation Selectors (WG2 N5266)

However, searching for WG2 N5266 did find a document.
https://www.unicode.org/wg2/docs/n5266-UVS_Proposal.pdf

So using Variation Selectors could, if the proposal is accepted, become a much more attractive proposition for researchers.

It did occur to me that, in my opinion, it would be better to encode 256 User-Defined Variation Selectors.

William Overington

Friday 21 June 2024

From john at tiro.ca Sat Jun 22 01:27:10 2024
From: john at tiro.ca (John Hudson)
Date: Fri, 21 Jun 2024 16:27:10 -0700
Subject: [MPEG-OTSPEC] Does a rendering system know if a variation selector requested glyph is not available in a font?
In-Reply-To: <3115cd38.1694.1903cf62ea3.Webtop.159@btinternet.com>
References: <3115cd38.1694.1903cf62ea3.Webtop.159@btinternet.com>
Message-ID:

On 2024-06-21 3:41 pm, William_J_G Overington wrote:
> It did occur to me that, in my opinion, it would be better to encode
> 256 User-Defined Variation Selectors.

Unless you are anticipating 256 variations of a single character needing to be captured in plain text, that would be significant overkill. Indeed, even if the arguments of the proposal document are accepted, I doubt if 128 codepoints would be assigned to implementing it. I suspect 64 would be plenty and 48 would /probably/ suffice.

Remember: a) you only need enough variation selectors to address the number of variants, not the number of characters of which there are variants; and b) these would be the equivalent of PUA variation selectors, so with no guaranteed interoperability beyond private agreement among font makers and within user communities.

So the variation selectors would be shared among characters and also across fonts for different fields and communities of scholars.

JH
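Point (a) can be illustrated with a few lines of arithmetic. The per-character counts below are hypothetical, apart from the 31 forms of Alpha mentioned earlier in the thread for Brill Epichoric; the sketch only shows that the number of selectors needed tracks the largest per-character count, because the same selectors are reused for every base character.

```python
def selectors_needed(variant_counts):
    """Selectors are shared across characters, so the requirement is the
    largest number of variants any single character has, not the total."""
    return max(variant_counts.values(), default=0)


# Hypothetical counts; only the figure for alpha comes from the thread.
variant_counts = {"alpha": 31, "beta": 12, "gamma": 20}

print("selectors needed:", selectors_needed(variant_counts))
print("variant glyphs in total:", sum(variant_counts.values()))
```

Whether one of the 31 forms would count as the unmarked default, and so need no selector, is left open here.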
From wjgo_10009 at btinternet.com Sat Jun 22 13:38:57 2024
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Sat, 22 Jun 2024 12:38:57 +0100 (BST)
Subject: [MPEG-OTSPEC] Does a rendering system know if a variation selector requested glyph is not available in a font?
In-Reply-To: <3115cd38.1694.1903cf62ea3.Webtop.159@btinternet.com>
References: <3115cd38.1694.1903cf62ea3.Webtop.159@btinternet.com>
Message-ID: <68a933bd.1b8a.1903fbe091b.Webtop.48@btinternet.com>

------ Original Message ------
From: john at tiro.ca
To: wjgo_10009 at btinternet.com; htl10 at users.sourceforge.net; mpeg-otspec at lists.aau.at
Sent: Friday, June 21st 2024, 00:27
Subject: Re: [MPEG-OTSPEC] Does a rendering system know if a variation selector requested glyph is not available in a font?

On 2024-06-21 3:41 pm, William_J_G Overington wrote:

It did occur to me that, in my opinion, it would be better to encode 256 User-Defined Variation Selectors.

Unless you are anticipating 256 variations of a single character needing to be captured in plain text, that would be significant overkill. Indeed, even if the arguments of the proposal document are accepted, I doubt if 128 codepoints would be assigned to implementing it. I suspect 64 would be plenty and 48 would probably suffice.

Remember: a) you only need enough variation selectors to address the number of variants, not the number of characters of which there are variants; and b) these would be the equivalent of PUA variation selectors, so with no guaranteed interoperability beyond private agreement among font makers and within user communities.

So the variation selectors would be shared among characters and also across fonts for different fields and communities of scholars.

JH

From wjgo_10009 at btinternet.com Sat Jun 22 14:45:00 2024
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Sat, 22 Jun 2024 13:45:00 +0100 (BST)
Subject: [MPEG-OTSPEC] Does a rendering system know if a variation selector requested glyph is not available in a font?
In-Reply-To: <3115cd38.1694.1903cf62ea3.Webtop.159@btinternet.com>
References: <3115cd38.1694.1903cf62ea3.Webtop.159@btinternet.com>
Message-ID: <51ce6976.1c6c.1903ffa817a.Webtop.48@btinternet.com>

> Unless you are anticipating 256 variations of a single character
> needing to be captured in plain text, that would be significant
> overkill. Indeed, even if the arguments of the proposal document are
> accepted, I doubt if 128 codepoints would be assigned to implementing
> it. I suspect 64 would be plenty and 48 would probably suffice.
> Remember: a) you only need enough variation selectors to address the
> number of variants, not the number of characters of which there are
> variants; and b) these would be the equivalent of PUA variation
> selectors, so with no guaranteed interoperability beyond private
> agreement among font makers and within user communities. So the
> variation selectors would be shared among characters and also across
> fonts for different fields and communities of scholars. JH

From wjgo_10009 at btinternet.com Sat Jun 22 15:47:16 2024
From: wjgo_10009 at btinternet.com (William_J_G Overington)
Date: Sat, 22 Jun 2024 14:47:16 +0100 (BST)
Subject: [MPEG-OTSPEC] Does a rendering system know if a variation selector requested glyph is not available in a font?
In-Reply-To: <3115cd38.1694.1903cf62ea3.Webtop.159@btinternet.com>
References: <3115cd38.1694.1903cf62ea3.Webtop.159@btinternet.com>
Message-ID: <68ad7b91.1d48.190403382b4.Webtop.48@btinternet.com>

John Hudson wrote

> Unless you are anticipating 256 variations of a single character
> needing to be captured in plain text, ...

I am anticipating the possibility that, once this feature becomes available, new ways of carrying out research may be developed.

For example, there was research on the Gutenberg Bible where glyphs for a letter are not precisely the same and it is thought that Gutenberg used a one-cast matrix system. I have wondered, if that is the case, would using many ligatures be a cost-saving practice? I have further wondered if there were unsuspected virtual ligatures such as pi and pl and so on made, two unconnected glyphs on the same piece of metal type.

Unicode has 256 official Variation Selectors available, some in plane 0, most in plane 14, so the same for the user-defined variation selectors seems reasonable to me.

Unicode traditionally uses blocks of 256 code points, so it seems to me that 256 user-defined variation selectors is a good idea.
The history of information technology has examples of where a decision over what was necessary was later changed.

Around 1980 it was decided that two digits was enough to specify a year. By the mid 1990s lots of software needed to be altered, at great cost and effort, as the year 2000 approached. If only four digits had been used from the start.

Here is a link to the Unicode roadmap page for plane 14.

https://www.unicode.org/roadmaps/ssp/

Having 256 user-defined variation selectors would be a full row. If fewer than 256 are encoded, would the rest of the row ever be used for something else?

Once these user-defined variation selectors are implemented then they may well be used in ways not in the proposal.

It would be possible to encode in plain text each of italic glyphs, bold glyphs, bold italic glyphs, titling font glyphs (that is, larger capital letters on the same point size body), Black Letter glyphs, red glyphs, and so on, all with a graceful fallback.

There is a famous print of the well-known "This is a Printing Office" text where most of the text is printed in black ink and one line is printed in red ink.

With these User-Defined Variation Selectors that text could now be typeset in plain text such that with an appropriate font the red lettering would be displayed, yet with a graceful fallback if the text were displayed other than with such a font.

William Overington

Saturday 22 June 2024

From htl10 at users.sourceforge.net Sun Jun 23 01:07:11 2024
From: htl10 at users.sourceforge.net (Hin-Tak Leung)
Date: Sat, 22 Jun 2024 23:07:11 +0000 (UTC)
Subject: [MPEG-OTSPEC] Does a rendering system know if a variation selector requested glyph is not available in a font?
In-Reply-To: <68ad7b91.1d48.190403382b4.Webtop.48@btinternet.com>
References: <3115cd38.1694.1903cf62ea3.Webtop.159@btinternet.com> <68ad7b91.1d48.190403382b4.Webtop.48@btinternet.com>
Message-ID: <1389707226.1712854.1719097631195@mail.yahoo.com>

I would probably say that using the Unicode variation selector mechanism for ligatures, italic and bold, and anything else that changes the spacing/advance width of a glyph, is perhaps abusing it a bit. Those go into GSUB, stylistic sets etc. Colour and other purposes that do not change spacing are an interesting use. For example, I can quite see it being used for Arabic tashkil: fonts which support it would display Arabic with coloured tashkil.

That said, I think what John's comment implies, namely that the selector is an index into a vendor's PUA, somewhat font vendor specific, and perhaps even font version specific (i.e. the font vendor decides to re-order the selectors in a font upgrade), is a bit problematic. That means the font vendor needs to ship a release note per font release detailing what is and what is not available, and where in the code space? So it is quite difficult to think of an index into 256+ glyph variants of a character.

As a side note, I think the whole Traditional Chinese and Simplified Chinese division should have been dealt with via something like this, i.e. Traditional/Simplified Chinese differ by glyph shape, not by meaning of the characters, but that ship sailed decades ago :-).
From htl10 at users.sourceforge.net Fri Jun 28 00:53:58 2024
From: htl10 at users.sourceforge.net (Hin-Tak Leung)
Date: Thu, 27 Jun 2024 22:53:58 +0000 (UTC)
Subject: [MPEG-OTSPEC] Does a rendering system know if a variation selector requested glyph is not available in a font?
References: <459491008.18128671.1719528838728.ref@mail.yahoo.com>
Message-ID: <459491008.18128671.1719528838728@mail.yahoo.com>

This is probably fairly well-known among Adobe folks, and perhaps Google Noto folks too. I have added a little more code to the submitted example to dump some UVS statistics for Adobe Source Han Sans JP (hence they probably apply to Noto CJK too). The current usage is this: just under 60,000 are base/canonical(?) single-character glyphs. About 1,400 characters map to multiple glyphs via variant selectors. The highest count is 15, the second highest is 8, and many have 2 to 3 variants. I guess the average for characters which have variants is under 4, and 60,000 + 4 × 1,400 ≈ 65,600 > 65,535. (We are getting over 64k glyphs soon... hurray!)

I had a look at some of them myself. Some of the characters having variants are quite common, e.g. the "loong" character (as they tell you this year is the "year of loong", rather than "year of dragon", in the Chinese Zodiac... the Chinese loong is a majestic creature and quite different from the evil western dragon...) and the first name of the pianist Lang-Lang (the surname and first name are transliterated to the same English word but are different characters, and one of them has a few glyph variants). They aren't really exotic variants: most native readers would recognise and accept the different variants as valid, while having an individual/regional choice of which to use. A bit like spelling "favourite/favorite" etc.
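A tally of the kind described above could be sketched as follows. This is an illustration, not the code submitted to freetype-py: the input layout (selector code point mapped to a list of (base code point, glyph name or None) pairs) is assumed here to mirror the `uvsDict` attribute that fontTools exposes on a format 14 cmap subtable, and the sample data is a tiny hypothetical table rather than anything read from Source Han Sans.

```python
from collections import Counter


def uvs_statistics(uvs):
    """Count, for each base character, how many variation selectors are mapped."""
    per_base = Counter()
    for _selector, entries in uvs.items():
        for base, _glyph in entries:
            per_base[base] += 1
    counts = sorted(per_base.values(), reverse=True)
    return {
        "characters_with_variants": len(per_base),
        "highest": counts[0] if counts else 0,
        "mean": sum(counts) / len(counts) if counts else 0.0,
    }


# Hypothetical sample: two base characters, three selectors.
sample = {
    0xE0100: [(0x4E00, "alt1"), (0x4E01, None)],
    0xE0101: [(0x4E00, "alt2"), (0x4E01, "alt3")],
    0xE0102: [(0x4E00, None)],
}
print(uvs_statistics(sample))

# The back-of-envelope glyph count from the message, using its rounded figures.
estimate = 60_000 + 4 * 1_400
print(estimate, estimate > 65_535)
```

The estimate treats every mapped selector as a distinct glyph, so it is an upper bound whenever some selectors map back to the base glyph.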
The order/numbering of the variants is a bit ad hoc though (and the count ranges from 2 to 15), so it is probably going to be vendor specific and also font version specific. And remember that 1,400 is a small number compared to 60,000.

Back to the original question: it is pretty fast computationally to see whether the glyph IDs for a character with and without a selector agree, or whether one is missing. It is more a UI/application issue than the rendering system's.

I don't quite get the construction of Adobe Source Han Sans. The look-up is not minimal, i.e. not all selectors are distinct, some just map back to the "base" glyph, and it is not exhaustive either (filling in the "upper" selectors by mapping to the base). I don't expect the latter to be the case, as it wastes space, but I sort of expect the former, i.e. the selectors should be distinct and minimal.

From lunde at unicode.org Fri Jun 28 04:21:01 2024
From: lunde at unicode.org (Ken Lunde)
Date: Thu, 27 Jun 2024 20:21:01 -0600
Subject: [MPEG-OTSPEC] Does a rendering system know if a variation selector requested glyph is not available in a font?
In-Reply-To: <459491008.18128671.1719528838728@mail.yahoo.com>
References: <459491008.18128671.1719528838728.ref@mail.yahoo.com> <459491008.18128671.1719528838728@mail.yahoo.com>
Message-ID: <34D8BB87-C9DF-4E9C-9FA6-A2D562204D1D@unicode.org>

Hin-Tak,

For better or worse, I am effectively the caretaker of the history of much of the CJK-related type activities that took place at Adobe over the last 30+ years, to include the development of the Source Han and Noto CJK Pan-CJK typefaces, which are clones of one another.

About the observations that you made, particularly about the lookup of UVSes in Source Han being suboptimal, that was intentional. While I have been the IVD Registrar since May of 2011, the registration of virtually all Adobe-Japan1 IVSes was performed by my former Adobe colleague, Eric Muller.
I suspect that your observation is about the Variation Selector that is associated with what is deemed the default UVS, meaning that the Format 14 'cmap' subtable defers to the Format 12 (or 4) 'cmap' subtable for the GID. When the first -- and by far, largest -- batch of Adobe-Japan1 IVS were registered in the IVD, it was intentional that the lowest -- by code point order -- Variation Selector was not associated with the UVS that is considered the default (aka encoded) one. This was purposefully done so that implementations would not make such an assumption. BTW, you may be interested in the "IVS Test" project that I started while at Adobe: https://github.com/adobe-fonts/ivs-test/ Regards... -- Ken > On Jun 27, 2024, at 16:53, Hin-Tak Leung via mpeg-otspec wrote: > > This is probably fairly well-known among Adobe folks, and perhaps Google Noto folks too. I have added a little more code to the submitted example on Adobe Source Han Sans JP to dump some UVS statistics (hence probably applies to Noto CJK too). The current usage of it is this: just under 60,000 are base/canonical(?) single character glyphs. About 1400 characters maps to multiple glyphs via variant selectors. The highest is 15, the 2nd highest then is 8, and many with 2 to 3 variants. I guess the average for characters which have variants is under 4, and 60,000 + 4 ? 1400 ~ 65600 > 65535 . > (We are getting over 64k glyph soon... hurray!) > > I have a look at some of them myself - some of the characters having variants are quite common - e.g. the "loong" character (as they tell you this year is the "year of loong", rather than "year of dragon", in Chinese Zodiac... the chinese loong is a majestic creature and quite different from the evil western dragon...) and first name of the pianist Lang-Lang (the surname and first name are transliterated to the same English phrase but different characters, and one of them have a few glyph variants). 
> They aren't really exotic variants - most native readers would recognise and accept the different variants as valid, while having an individual/regional choice of which to use. A bit like spelling "favourite/favorite" etc.
>
> The order/numbering of the variants is a bit ad hoc though (and the number of variants per character ranges from 2 to 15), so it is probably going to be vendor and also font version specific. And remember that 1400 is a small number compared to 60,000.
>
> Back to the original question - it is computationally cheap to check whether the glyph IDs for a character with and without a selector agree, or whether a mapping is missing. It is more a UI/application issue than the rendering system's.
>
> I don't quite get the construction of Adobe Source Han Sans - the look-up is not minimal, i.e. not all selectors are distinct, some just map back to the "base" glyph - and it is not exhaustive either (filling in the "upper" selectors by mapping to the base). I don't expect the latter to be the case, as it wastes space, but I sort of expect the former, i.e. the selectors should be distinct and minimal.
> _______________________________________________
> mpeg-otspec mailing list
> mpeg-otspec at lists.aau.at
> https://lists.aau.at/mailman/listinfo/mpeg-otspec

From htl10 at users.sourceforge.net Sun Jun 30 04:03:10 2024
From: htl10 at users.sourceforge.net (Hin-Tak Leung)
Date: Sun, 30 Jun 2024 02:03:10 +0000 (UTC)
Subject: [MPEG-OTSPEC] Does a rendering system know if a variation selector requested glyph is not available in a font?
In-Reply-To: <34D8BB87-C9DF-4E9C-9FA6-A2D562204D1D@unicode.org>
References: <459491008.18128671.1719528838728.ref@mail.yahoo.com> <459491008.18128671.1719528838728@mail.yahoo.com> <34D8BB87-C9DF-4E9C-9FA6-A2D562204D1D@unicode.org>
Message-ID: <87213699.19147171.1719712990690@mail.yahoo.com>

On Friday 28 June 2024 at 06:12:44 BST, Ken Lunde wrote:

> Hin-Tak,
>
> For better or worse, I am effectively the caretaker of the history of much of the CJK-related type activities that took place at Adobe over the last 30+ years, to include the development of the Source Han and Noto CJK Pan-CJK typefaces, which are clones of one another.
>
> About the observations that you made, particularly about the lookup of UVSes in Source Han being suboptimal, that was intentional. While I have been the IVD Registrar since May of 2011, the registration of virtually all Adobe-Japan1 IVSes was performed by my former Adobe colleague, Eric Muller. I suspect that your observation is about the Variation Selector that is associated with what is deemed the default UVS, meaning that the Format 14 'cmap' subtable defers to the Format 12 (or 4) 'cmap' subtable for the GID. When the first -- and by far, largest -- batch of Adobe-Japan1 IVSes was registered in the IVD, it was intentional that the lowest -- by code point order -- Variation Selector was not associated with the UVS that is considered the default (aka encoded) one. This was purposefully done so that implementations would not make such an assumption.
>
> BTW, you may be interested in the "IVS Test" project that I started while at Adobe:
> https://github.com/adobe-fonts/ivs-test/

Thanks Ken, for the anecdotes about the development history. I am aware that technical decisions are often made not entirely based on technical considerations. A decision may not even be optimal at the time, and certainly not in hindsight. It is always interesting to learn how "oddities" come to be.
It makes a lot of sense to intentionally NOT associate the lowest variation selector with the default. Technically, a selector for the default is redundant (one could save one code point by just "spec'ing it out", removing it and gaining the use of one empty slot). A lot of parties are going to argue that they want their favourite as the default, so "default" in this case is a political minefield too.

I was curious about the non-optimality of the format 14 cmap in Adobe Source Han Sans, and wondered whether it is kept in sync with the Serif font, i.e. two glyph shapes can be non-degenerate and different in the Serif font (e.g. a brush stroke tapering from top right to bottom left, vs the reverse tapering from bottom left to top right) yet become identical in the Sans font. But I found that the Serif font has an entirely different versioning and release schedule. While its UVS table feels more optimal, no conclusion could be drawn about its relationship with the Sans font. There is probably another interesting story there.

Thanks for the URL for the ivs-test - it looks to be an interesting "stress test" benchmarking sample for performance in related software/code paths!

Regards,
Hin-Tak
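
For readers following the thread, the check discussed above (does a base character + variation selector sequence have a format 14 mapping, and if so is it a default, a distinct, or a redundant one?) can be sketched roughly as below. This is only an illustration under stated assumptions, not how any particular shaping engine or font tool does it. The function names, the category labels, the glyph names and the toy mapping data are all hypothetical. The two plain dicts stand in for a font's base 'cmap' subtable (format 4/12) and its format 14 subtable, with `None` used here to mark a default UVS that defers to the base mapping.

```python
from collections import Counter


def classify_uvs(base, selector, base_cmap, uvs_dict):
    """Classify one <base, selector> sequence against toy cmap data.

    base_cmap: {code point: glyph name}, standing in for a format 4/12 subtable.
    uvs_dict:  {selector: [(code point, glyph name or None), ...]}, standing in
               for a format 14 subtable; None marks a default UVS (assumption).
    """
    if base not in base_cmap:
        return "missing-base"      # the font does not cover the character at all
    for code_point, glyph in uvs_dict.get(selector, []):
        if code_point != base:
            continue
        if glyph is None:
            return "default"       # defers to the base cmap for the glyph
        if glyph == base_cmap[base]:
            return "redundant"     # non-default record that maps back to the base glyph
        return "distinct"          # a genuinely different glyph
    return "unsupported"           # no mapping: silent fallback, worth highlighting


def variants_per_character(uvs_dict):
    """Count how many selectors are mapped for each base character."""
    counts = Counter()
    for records in uvs_dict.values():
        for code_point, _glyph in records:
            counts[code_point] += 1
    return counts


# Hypothetical toy data; glyph names and mappings are invented for illustration.
BASE_CMAP = {0x9F8D: "uni9F8D", 0x6717: "uni6717"}
UVS_DICT = {
    0xE0100: [(0x9F8D, "uni9F8D.v1")],
    0xE0101: [(0x9F8D, None), (0x6717, "uni6717")],
}

if __name__ == "__main__":
    samples = [
        (0x9F8D, 0xE0100),
        (0x9F8D, 0xE0101),
        (0x6717, 0xE0101),
        (0x6717, 0xE0100),
        (0x4E00, 0xE0100),
    ]
    for base, selector in samples:
        status = classify_uvs(base, selector, BASE_CMAP, UVS_DICT)
        print(f"U+{base:04X} U+{selector:05X}: {status}")
```

An application wanting the behaviour asked about at the start of the thread would presumably highlight sequences classified as "unsupported", while the "redundant" category corresponds to the non-minimal mappings observed in the Sans font.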