[MPEG-OTSPEC] Does a rendering system know if a variation selector requested glyph is not available in a font?

Hin-Tak Leung htl10 at users.sourceforge.net
Fri Jun 28 00:53:58 CEST 2024


 This is probably fairly well known among Adobe folks, and perhaps Google Noto folks too. I have added a little more code to the submitted example to dump some UVS statistics for Adobe Source Han Sans JP (so the numbers probably apply to Noto CJK too). Current usage is this: just under 60,000 are base/canonical(?) single-character glyphs. About 1400 characters map to multiple glyphs via variation selectors. The highest count is 15, the second highest is 8, and many characters have 2 to 3 variants. I guess the average for characters which have variants is under 4, and 60,000 + 4 × 1400 ~ 65600 > 65535. (We are getting over 64k glyphs soon... hurray!)
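For anyone wanting to reproduce that kind of tally, here is a minimal sketch in Python. It is not the code added to the submitted example. It assumes the UVS data has already been read into a dict shaped like the `uvsDict` attribute of a fontTools cmap format 14 subtable (selector -> list of (code point, glyph name or None) pairs); the function name `uvs_variant_counts` and the toy data are made up for illustration.

```python
from collections import Counter

def uvs_variant_counts(uvs_dict):
    """Count how many variation sequences each base character has.

    uvs_dict is assumed to look like fontTools' cmap format 14 uvsDict:
    {selector: [(base_code_point, glyph_name_or_None), ...]}.
    """
    per_char = Counter()
    for _selector, entries in uvs_dict.items():
        for base, _glyph in entries:
            per_char[base] += 1
    return per_char

# Toy data, not taken from any real font.
toy_uvs = {
    0xE0100: [(0x4E00, None), (0x4E01, "glyphB.v0")],
    0xE0101: [(0x4E00, "glyphA.v1"), (0x4E01, "glyphB.v1")],
    0xE0102: [(0x4E00, "glyphA.v2")],
}

counts = uvs_variant_counts(toy_uvs)
histogram = Counter(counts.values())  # {variants per character: number of characters}
highest = max(counts.values())
average = sum(counts.values()) / len(counts)

# The back-of-envelope estimate from the text: ~60,000 base glyphs plus
# ~4 variants for each of ~1400 characters, against the 16-bit glyph ID limit.
estimate = 60_000 + 4 * 1400
over_limit = estimate > 0xFFFF
```

With a real font, the dict would come from the font's format 14 cmap subtable rather than from `toy_uvs`.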
I had a look at some of them myself - some of the characters with variants are quite common - e.g. the "loong" character (as they tell you, this year is the "year of the loong" rather than the "year of the dragon" in the Chinese zodiac... the Chinese loong is a majestic creature and quite different from the evil western dragon...) and the first name of the pianist Lang Lang (the surname and first name are transliterated to the same English word but are different characters, and one of them has a few glyph variants). They aren't really exotic variants - most native readers would recognise and accept the different variants as valid, while having an individual/regional preference for which to use. A bit like spelling "favourite/favorite" etc.
The order/numbering of the variants is a bit ad hoc though (and the count ranges from 2 to 15), so it is probably going to be vendor-specific and also font-version-specific. And remember 1400 is a small number compared to 60,000.
Back to the original question - it is computationally cheap to check whether the glyph IDs for a character with and without a selector agree, or whether the variant is missing. It is more a UI/application issue than the rendering system's.
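To make that check concrete, here is a rough sketch, assuming the ordinary cmap is available as a {code point: glyph name} dict and the UVS data as a dict in the style of fontTools' format 14 `uvsDict`, where a glyph of None marks a default sequence that resolves to the ordinary cmap glyph. The function name `classify_variation`, the status strings and the toy data are all hypothetical.

```python
def classify_variation(cmap, uvs_dict, base, selector):
    """Classify what a font offers for the sequence <base, selector>.

    cmap:     {code_point: glyph_name}, the ordinary character map.
    uvs_dict: {selector: [(code_point, glyph_name_or_None), ...]}; None
              marks a default sequence that resolves to the cmap glyph.
    Returns (status, glyph_name).
    """
    base_glyph = cmap.get(base)
    for code_point, glyph in uvs_dict.get(selector, []):
        if code_point == base:
            resolved = base_glyph if glyph is None else glyph
            status = "same-as-base" if resolved == base_glyph else "distinct"
            return status, resolved
    # No entry: the font does not list this sequence. A renderer would
    # typically ignore the selector and show the base glyph.
    return "missing", base_glyph

# Toy data, not taken from any real font.
toy_cmap = {0x4E00: "glyphA"}
toy_uvs = {
    0xE0100: [(0x4E00, None)],         # default sequence -> base glyph
    0xE0101: [(0x4E00, "glyphA.v1")],  # non-default sequence -> variant glyph
}

r_default = classify_variation(toy_cmap, toy_uvs, 0x4E00, 0xE0100)
r_variant = classify_variation(toy_cmap, toy_uvs, 0x4E00, 0xE0101)
r_missing = classify_variation(toy_cmap, toy_uvs, 0x4E00, 0xE0102)
```

The "missing" status is the information an application would need if it wanted to tell the user that the requested variant is not in the font; the linear scan could be replaced by a pre-built {(base, selector): glyph} index if many lookups are needed.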
I don't quite get the construction of Adobe Source Han Sans - the look-up is not minimal, i.e. not all selectors are distinct, and some just map back to the "base" glyph - and it is not exhaustive either (i.e. it does not fill in the "upper" selectors by mapping them to the base). I wouldn't expect the latter, as it would waste space, but I did sort of expect the former, i.e. that the selectors should be distinct and minimal.
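The "not minimal" observation can be checked mechanically. The following is only a sketch under the same assumptions as before (a {code point: glyph name} cmap and a fontTools-style `uvsDict`, with None meaning "use the cmap glyph"); `audit_selectors` and the toy data are hypothetical, and whether such entries are intentional in a given font is a separate matter.

```python
def audit_selectors(cmap, uvs_dict):
    """Report, per base character, selectors that add nothing new.

    Returns {base: [selector, ...]} listing selectors whose resolved glyph
    equals the base glyph or repeats an earlier selector's glyph.
    """
    seen = {}       # base -> set of glyphs already reachable
    redundant = {}  # base -> list of selectors that add no new glyph
    for selector in sorted(uvs_dict):
        for base, glyph in uvs_dict[selector]:
            resolved = cmap.get(base) if glyph is None else glyph
            glyphs = seen.setdefault(base, {cmap.get(base)})
            if resolved in glyphs:
                redundant.setdefault(base, []).append(selector)
            else:
                glyphs.add(resolved)
    return redundant

# Toy data, not taken from any real font.
toy_cmap = {0x4E00: "g1", 0x4E01: "g2"}
toy_uvs = {
    0xE0100: [(0x4E00, None), (0x4E01, "g2.a")],
    0xE0101: [(0x4E00, "g1.a"), (0x4E01, "g2.a")],
    0xE0102: [(0x4E00, "g1")],
}

report = audit_selectors(toy_cmap, toy_uvs)
```

In the toy data, selectors 0xE0100 and 0xE0102 lead back to the base glyph of U+4E00, and 0xE0101 repeats the glyph that 0xE0100 already gives for U+4E01, so all three are reported.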

