[MPEG-OTSPEC] Could software used for GSUB decoding be adapted to decode localizable sentence codes please?

wjgo_10009 at btinternet.com
Sat Apr 4 21:08:43 CEST 2020


> I fail to see the relevance of this to fonts.

It does not relate to fonts. Yet it does relate to OpenType in that for 
the GSUB table in an OpenType font to be applied in a practical 
situation, there needs to be a software application that can detect 
examples in the text stream of sequences that match the sequences 
defined in the font.
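
To illustrate the kind of sequence detection meant here, the following is a minimal sketch in Python. It is a simplification and not how an actual OpenType shaping engine processes GSUB lookups. The function name apply_substitutions, the rules dictionary and the glyph names f_f_i and f_i are hypothetical, chosen only for this example.

```python
def apply_substitutions(items, rules):
    """Replace known sequences in `items`, scanning left to right.

    `rules` maps a tuple of input items to a single replacement item.
    At each position the longest matching sequence wins (an assumption
    made for this sketch).
    """
    # Try longer sequences first so ("f", "f", "i") beats ("f", "i").
    ordered = sorted(
        ((seq, rep) for seq, rep in rules.items() if seq),
        key=lambda pair: len(pair[0]),
        reverse=True,
    )
    out = []
    i = 0
    while i < len(items):
        for seq, rep in ordered:
            if tuple(items[i:i + len(seq)]) == seq:
                out.append(rep)      # emit the replacement
                i += len(seq)        # skip the matched sequence
                break
        else:
            out.append(items[i])     # no rule matched at this position
            i += 1
    return out


# Hypothetical ligature-style rules, for illustration only.
rules = {("f", "f", "i"): "f_f_i", ("f", "i"): "f_i"}
print(apply_substitutions(list("office"), rules))
```

The same matching loop would work on any stream of code points, which is why software of this kind might in principle be adapted to detect the code sequences of localizable sentences.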

> We have existing standards to identify languages, not least the ISO 
> 639 series, and the IETF BCP-47 (which uses the ISO specification). 
> The IETF specification is capable of being extremely specific if 
> desired - the document gives the example de-CH-1901 (German as used 
> in Switzerland using the 1901 variant [orthography]).

Yes. One of those codes can be used in a comment in a sentence.dat file 
so as to provide feedback to a human being of the target language of the 
particular sentence.dat file.
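
As a rough illustration of how specific such a tag can be, here is a minimal Python sketch that splits a tag like de-CH-1901 into its parts. It assumes a simplified subset of BCP 47 (language, optional script, optional region, variants) and ignores extensions, private-use subtags and other cases. The names TAG_PATTERN and split_language_tag are hypothetical, and nothing here describes the sentence.dat format itself.

```python
import re

# Simplified pattern: language[-script][-region][-variant...]
TAG_PATTERN = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?"
    r"(?P<variants>(?:-(?:[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}))*)$"
)


def split_language_tag(tag):
    """Split a simple language tag into its subtags."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"not a tag this sketch understands: {tag!r}")
    # The variants group looks like "-1901-foobar"; drop the empty pieces.
    variants = [v for v in match.group("variants").split("-") if v]
    return {
        "language": match.group("language"),
        "script": match.group("script"),
        "region": match.group("region"),
        "variants": variants,
    }


print(split_language_tag("de-CH-1901"))
```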

Until reading your post I was unaware of the extremely specific 
capability of those standards.

> Anyone is welcome to build translation systems, or libraries of 
> pre-defined translations, and I would recommend that they use these 
> well-thought-out tagging systems — but what has that got to do with 
> fonts?

Well, my sentence.dat format does so.

The format dates from around 2014 but in reviewing the published 
documents I realized that they differ from some recent ideas of mine 
about the code numbers for the sentences, so I have today produced and 
published a new document.

http://www.users.globalnet.co.uk/~ngo/The_Format_of_the_sentence_dot_dat_files_for_use_in_Research_on_Communication_through_the_Language_Barrier_using_encoded_Localizable_Sentences.pdf

http://www.users.globalnet.co.uk/~ngo/localizable_sentences_research.htm

Hopefully that new document helps to explain.

As it happens I have since 2016 been writing some novels built around 
the idea of localizable sentences. Yes, it is sort of science fiction, 
but the intention is to convey my ideas in a popularly readable format. 
After I had completed the first novel in 2019 I missed writing it, so I 
started a second novel. Free to read, no registration requested.

Well, I am not a novelist in the sense of doing it professionally, and I 
appreciate that writing novels like that around an invention might raise 
some eyebrows, but I like to think that they put my ideas across 
effectively, and they have helped me express ideas, and, well, if they 
help keep my mind active as I get older, that is good.

Two chapters from the second novel might be helpful here.

http://www.users.globalnet.co.uk/~ngo/localizable_sentences_the_second_novel_chapter_009.pdf

quote from the chapter

John continues. “So I am going to describe the three ways that are 
proposed in our research, describing each by an example encoding, in 
each case for the original sentence that I mentioned, namely ‘Good day.’

end quote

http://www.users.globalnet.co.uk/~ngo/localizable_sentences_the_second_novel_chapter_027.pdf

quote from the chapter

“Well, I don’t know what it is about but someone was asking if decoding 
localizable sentences had a sort of ffi problem and someone say ‘no’. I 
was wondering what that is about please.”

end quote

Readers might like to look at the part about the possible encoding of 
localizable sentences in Unicode in one of my replies to the Public 
Review of the QID Emoji proposal.

https://www.unicode.org/review/pri408/

Links to the novels.

http://www.users.globalnet.co.uk/~ngo/novel_plus.htm

http://www.users.globalnet.co.uk/~ngo/locse_novel2.htm

http://www.users.globalnet.co.uk/~ngo/

The webspace is hosted on a server run by PlusNet PLC, a United Kingdom 
Internet Service Provider. The webspace is not hosted on my computer.

I am hoping that the ideas can result in an ISO standard and that the 
invention can be applied in practical use on computers and mobile 
devices. My view is that a free-to-use non-proprietary ISO standard is 
the way to achieve this. The invention could then be integrated into 
Unicode plain text usage interoperably across various platforms.

William Overington

Saturday 4 April 2020


