[MPEG-OTSPEC] [EXTERNAL] New AHG mandates and other news!

Peter Constable pconstable at microsoft.com
Wed May 12 19:03:30 CEST 2021


William:

> I looked at the idea from a pure mathematics perspective… I am not a linguist…

At a purely conceptual level, thinking about potential semiotic systems, you have a well-formed idea: have some information token that represents a purely semantic proposition that can be used in a document / communication, and then processes at the input end and the display end can map that token from/to expressions in various languages.

But it isn’t really practical from a linguistic perspective. It assumes that semantic propositions can be defined that can always be used as translation pivots between two arbitrary languages, and that those can be formulated independent of any larger discourse context. It’s also not scoped: Adapting examples from your slides, I might want to send a message to the hotel saying, “If my dog has hair rather than fur, and doesn’t shed, will I be able to keep my dog in your hotel rooms?” There are an unbounded number of semantic propositions someone might want to communicate in your hotel scenario. There’s no clear way to scope.

Moreover…

> I am just told that it is 'out of scope'…

That’s because Unicode is an encoding for _characters / text elements_, not semantic propositions.


>> So, what I hear you now saying is that your proposal for localizable sentences does not need your proposed ‘text’ table…
> I am not saying that at all. I said that the font does not get transported with the message. The font would be resident in the receiving equipment ready to be used in converting the incoming binary encoded data of the email to the glyphs for the local display in the language for which the receiving equipment is set up.

If the idea is that the display strings for the set of localizable sentences are resident in the receiving device, then that’s an instance of something that is frequently done: it’s very common for software running on devices to have access to lots of static strings to display messages to a user. However, fonts are not the storage mechanism used for such strings, and software vendors definitely would not want to start using fonts as a new way to store static strings for user-interface messages.
Strings in fonts should be about the font itself. Your proposal is not that.
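
To make the usual arrangement concrete, here is a minimal sketch of static display strings held by the software itself rather than in a font. Everything in it is an assumption for illustration only: the message IDs, the `STRINGS` table, the `localize` helper and the fallback rule are hypothetical and do not describe any particular product or any part of the OpenType specification.

```python
# Hypothetical string table: message ID -> display string, per locale.
# Real software typically loads such tables from resource files, not fonts.
STRINGS = {
    "en": {"IS_IT_SNOWING": "Is it snowing?", "YES": "Yes.", "NO": "No."},
    "fr": {"IS_IT_SNOWING": "Est-ce qu'il neige ?", "YES": "Oui.", "NO": "Non."},
}

# Assumed fallback locale, used when a locale or message is missing.
FALLBACK_LOCALE = "en"


def localize(message_id: str, locale: str) -> str:
    """Return the display string for message_id in the given locale.

    Falls back to FALLBACK_LOCALE if the locale or the message is unknown,
    and to the raw message ID if no table contains the message at all.
    """
    table = STRINGS.get(locale, STRINGS[FALLBACK_LOCALE])
    if message_id in table:
        return table[message_id]
    return STRINGS[FALLBACK_LOCALE].get(message_id, message_id)


if __name__ == "__main__":
    # The sender transmits only the ID; the receiver picks the language.
    print(localize("IS_IT_SNOWING", "fr"))
```

The sketch also shows the scoping problem: the table only covers sentences someone has listed in advance, so any message outside that list has no entry to look up.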

> There could be applications in text to speech for emoji…

There are already implementations for that, and they don’t use fonts for storing the strings.



Peter


From: mpeg-otspec <mpeg-otspec-bounces at lists.aau.at> on behalf of William_J_G Overington <wjgo_10009 at btinternet.com>
Date: Wednesday, May 12, 2021 at 2:27 AM
To: 'MPEG OT Spec list' <mpeg-otspec at lists.aau.at>
Subject: [EXTERNAL] [MPEG-OTSPEC] New AHG mandates and other news!
> I should have been more careful and precise: after discussion on the email lists had made clear that your proposal would not gain the consensus needed to be adopted as part of Unicode, and after repeated attempts to float the idea were met with the same responses of opposition to the idea, and after repeated requests by the list administrator not to continue discussion of a proposal that had consistently failed to garner support, further discussion of the topic was banned.



The ban was imposed by a fictional character, not by a named officer of Unicode Inc., so the ban is unfair. The ban has not been imposed by the Unicode Technical Committee. So it is not within the stated rules.



Looking back I feel that I made a mistake in the way that I presented the idea.



I looked at the idea from a pure mathematics perspective. So I started with a very basic trivial case. Asking if it is snowing. So yes, in a thought experiment, that message can be sent as one character and localized at the receiving end into another language. A reply can be sent as one character from that remote location and localized on the display device of the original sender.



So build on that trivial case.



Part of the culture in England is to talk about the weather. The weather in England is very changeable, due to its geographical location both near a large land area and a large ocean.



So, to explore the idea, I had dialogue through the language barrier about the weather.



Alas, lots of people do not think in that pure mathematics way and so maybe the idea seemed trivial.



I am not a linguist. I am interested in languages but I am not good at them. My background is in applied physics and mathematics.



Back in 2016, with progress in the doldrums, I decided to try to write a novel to put my ideas over. I brought back some story characters from some short stories that I wrote in the late 1990s. I had been to Creative Writing classes in 1997, gaining two certificates. These are at level 2 in the English qualification framework and are regarded as each equivalent to 0.2 of a GCSE subject qualification. I used what I had learned about Creative Writing to start the novel with action.



I published chapters on the web as I progressed, not always in numerical order. There was no overall plan, but I enjoyed writing it and it is deposited in The British Library where it is conserved. I completed the novel in February 2019. I missed writing it, so I started a sequel, which is a work in progress, with quite a lot of chapters already published on the web. Free to read, no registration required or requested. The webspace is hosted on a server run by PlusNet PLC, not on my computer.



Yet the research in the novel is real. For example, in Chapter 21 of the first novel, the email to which reference is made is one that was actually sent to me in real life by a linguist.



There is also an author note after Chapter 21.



http://www.users.globalnet.co.uk/~ngo/novel_plus.htm



> As far as I know, no proposal document has been submitted to and taken up by the Unicode Technical Committee, but I am quite sure that, if a proposal were put on the agenda for a UTC meeting, it would not be accepted.



I have tried. The gatekeeper, no name stated, I do not know who he or she is, has refused to add the documents to the Current Document Register. I am just told that it is 'out of scope' but given no explanation for that ruling. So maybe nobody on the committee knows about this continual blocking of ideas.



Well, once there was such a document on the agenda but it got nowhere. I cannot remember if any decision was actually made or whether it was just left undecided. That was years ago.



What concerns me is that even before I have written a new proposal document, you are quite sure that it would not be accepted by the Unicode Technical Committee. I opine that there should be a fair assessment, based on the present state of the research, not prejudiced by opinions of twelve years ago. The Unicode Technical Committee once turned down emoji, then a few years later changed its mind. So given a fair chance to present my ideas and a fair assessment, the idea could become accepted.



>> There is as far as I am aware no premise or presumption when sending any email message that a font will get transported with the message…
> So, what I hear you now saying is that your proposal for localizable sentences does not need your proposed ‘text’ table. Then what is the point of the proposed ‘text’ table?

I am not saying that at all. I said that the font does not get transported with the message. The font would be resident in the receiving equipment ready to be used in converting the incoming binary encoded data of the email to the glyphs for the local display in the language for which the receiving equipment is set up.

>> The 'text' table would have far wider application than just localizable sentences.
> No usage scenario has been presented suggesting a ‘text’ table would be useful in text-display implementations.

Well, there was a call for ideas and I have put an idea forward. There could be applications in text to speech for emoji and other symbols.



By the way, an administrative note. There seems to be something peculiar happening with the emails that I am sending to the list. Twice now an email has neither reached the archive nor come back to me as a copy. After a while I post it again and suddenly two copies appear in the archive and arrive here. Quite peculiar.



William Overington



Wednesday 12 May 2021

