[MPEG-OTSPEC] Proposed Adoption of the Online Standards Development Platform
Hin-Tak Leung
htl10 at users.sourceforge.net
Tue Sep 17 20:34:32 CEST 2024
On Tuesday 17 September 2024 at 18:27:41 BST, Werner LEMBERG via mpeg-otspec <mpeg-otspec at lists.aau.at> wrote:
> >> I have not yet seen a single XML document processed with the
> >> standard XML tools to create a PDF file that really looks good from
> >> a typographical point of view.
> >
> > That's possibly because the good ones generally don't mention how
> > they were produced.
> This might be indeed the case.
> > E.g. a good proportion of best-selling fiction is done from XML
> > using CSS for print and AntennaHouse Formatter or PDFReactor or
> > RenderX or PrinceXML, as are a good many scientific and technical
> > journal articles.
> Well, the question how much manual editing was necessary after
> converting the input XML...
> > Note that since Word and Open/Libre Office both use XML, the
> > document is ALREADY being produced from XML.
> I was imprecise, sorry. There is a big difference between using XML
> as a document storage format, with zillions of proprietary and/or
> program-specific extensions, and using XML for document interchange.
> I meant the latter.
RenderX is proprietary, I think, but it was considered one of the best (or the only one?) when I last looked, two decades ago. Apache FOP was just beginning 20 years ago, but I remember it was quite okay. FOP stands for "Formatting Objects Processor", and FO, from the look of it, is quite like a description of PDF's data structure in XML syntax (with a special namespace). FO (the namespace) is probably a W3C standard now, and FOP should have improved somewhat over the last 20 years?
Granted, the result of
XML (some general namespace) -> (XSLT) -> FO -> (FOP / RenderX) -> PDF
is going to depend a bit on how clever the XSLT (specific to this particular XML format) is, and how cleverly FOP / RenderX does things. When I looked at FOP 20 years ago, it couldn't do automatic hyphenation (LaTeX style; sorry, I am a LaTeX fan...), and its spacing algorithms etc. were a bit lacking, but it must have improved by now.
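To make the middle step of the pipeline above concrete, here is a minimal sketch of what "some XML -> FO" looks like. It is written in Python with only the standard library, standing in for the XSLT stylesheet a real setup would use. The input vocabulary (`<doc>` / `<para>`), the helper names, and the page dimensions are made up for illustration; only the FO namespace URI and the FO element names come from the XSL-FO vocabulary.

```python
import xml.etree.ElementTree as ET

# The XSL-FO namespace; "fo" is the conventional prefix.
FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)


def fo(tag):
    """Return the namespace-qualified name for an FO element."""
    return f"{{{FO_NS}}}{tag}"


def to_fo(source_xml):
    """Turn a hypothetical <doc><para>...</para></doc> input into an FO document.

    A real pipeline would do this step with an XSLT stylesheet written
    for the particular input format; this function only imitates it.
    """
    src = ET.fromstring(source_xml)

    root = ET.Element(fo("root"))

    # Page geometry: one simple page master with a body region.
    masters = ET.SubElement(root, fo("layout-master-set"))
    page = ET.SubElement(
        masters,
        fo("simple-page-master"),
        {
            "master-name": "page",
            "page-height": "29.7cm",
            "page-width": "21cm",
            "margin": "2cm",
        },
    )
    ET.SubElement(page, fo("region-body"))

    # Content: every <para> in the input becomes one fo:block.
    sequence = ET.SubElement(root, fo("page-sequence"), {"master-reference": "page"})
    flow = ET.SubElement(sequence, fo("flow"), {"flow-name": "xsl-region-body"})
    for para in src.iter("para"):
        block = ET.SubElement(flow, fo("block"), {"space-after": "6pt"})
        block.text = para.text

    return ET.tostring(root, encoding="unicode")


fo_text = to_fo("<doc><para>Hello</para><para>World</para></doc>")
print(fo_text)
```

The resulting text would then be handed to a formatter such as FOP or RenderX to produce the PDF. Line breaking, hyphenation and spacing all happen inside that formatter, which is why the final quality depends on it as much as on the stylesheet.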
Although, strictly speaking, automatic hyphenation in the LaTeX babel style is probably a bit against the XML philosophy of preserving the content....
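A small sketch of the content-preservation point, in Python: if break opportunities are recorded in the text itself as soft hyphens (U+00AD), the stored string is no longer identical to the source, although the original can be recovered by stripping them again. The function names and the break positions (hy-phen-ation) are chosen by hand for illustration and are not computed by any hyphenation algorithm.

```python
SOFT_HYPHEN = "\u00ad"  # Unicode SOFT HYPHEN, an invisible break opportunity


def add_soft_hyphens(word, break_points):
    """Insert a soft hyphen before each character index in break_points."""
    pieces = []
    prev = 0
    for pos in sorted(break_points):
        pieces.append(word[prev:pos])
        prev = pos
    pieces.append(word[prev:])
    return SOFT_HYPHEN.join(pieces)


def strip_soft_hyphens(text):
    """Recover the original content by removing all soft hyphens."""
    return text.replace(SOFT_HYPHEN, "")


original = "hyphenation"
marked = add_soft_hyphens(original, [2, 6])  # hy-phen-ation, picked by hand

print(marked == original)                      # the stored text has changed
print(strip_soft_hyphens(marked) == original)  # but the content is recoverable
```

A formatter that hyphenates only at rendering time sidesteps the issue, since the source text is never touched.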