Roy Schestowitz
 

__/ [Alan J. Flavell] on Sunday 11 September 2005 11:19 \__

On Sun, 11 Sep 2005, SpaceGirl wrote:

Alan J. Flavell wrote:


[comprehensive quote of my posting, without apparently having anything
relevant to say about it.]

Word XP and upwards stores its documents in XML format, doesn't it?


So what? XML is only a format for defining markup. If the markup
doesn't do anything meaningful (specifically - if it only creates a
visual result on a printed page, without having any significant
structure) then it's not going to turn into effective HTML: it'd just
be the usual garbage in / garbage out that we're accustomed to with
Word conversions to soi-disant "web" format.

You could probably write your own XSLT to turn it into HTML fairly
easily.


There seems to be some kind of conceptual disconnect here. Most Word
documents (in my experience) simply don't contain the necessary
structure for useful conversion to HTML: they've been created as a
purely visual construction for printing onto paper. It's irrelevant
what underlying technology you use (RTF, XML, SGML, whatever) - the
problem is that the source material simply does not represent the
needed structures, *because the document authors do not put it there*.

You might as well try to convert cheese into fresh cream: both are
fine milk products, it's true, but instead of trying to convert the
one into the other, you'd do better to produce them both starting from
fresh milk. And the kind of "fresh milk" that's needed here is
logically structured text markup. Not visual formatting. Until the
authors of Word documents can grasp that, the prospects for conversion
of Word to web formats are poor, IMHO.
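To make the point about missing structure concrete, here is a minimal
sketch in Python. The <doc>/<para> markup, the style names and the
to_html helper are all hypothetical and invented for illustration; this
is NOT the real Word XML schema, nor anyone's actual converter. It only
shows that a converter can map structure to HTML elements when the
author put structure there, and has nothing to work with otherwise.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical, simplified markup (an assumption, not the real Word schema).
# The "visual" document only says how the text should look on paper.
VISUAL_DOC = """<doc>
  <para bold="yes" size="18">Quarterly report</para>
  <para size="11">Sales rose in the third quarter.</para>
</doc>"""

# The "structured" document says what each paragraph *is*.
STRUCTURED_DOC = """<doc>
  <para style="Heading1">Quarterly report</para>
  <para style="Body">Sales rose in the third quarter.</para>
</doc>"""

# Assumed mapping from named styles to HTML elements.
STYLE_TO_TAG = {"Heading1": "h1", "Heading2": "h2", "Body": "p"}


def to_html(source):
    """Convert the toy markup to HTML, using named styles where present."""
    lines = []
    for para in ET.fromstring(source).iter("para"):
        # Without a named style there is no structure to recover,
        # so the only safe fallback is a plain paragraph.
        tag = STYLE_TO_TAG.get(para.get("style"), "p")
        lines.append(f"<{tag}>{escape(para.text or '')}</{tag}>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_html(VISUAL_DOC))
    print(to_html(STRUCTURED_DOC))
```

With the visual document every paragraph comes out as <p>, because bold
18pt text is not the same thing as a heading; only the structured
document yields an <h1>. Guessing headings from font size is possible,
but that is heuristics, not conversion.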


I fully agree with you on that point. Any attempt at rephrasing the same
ideas would only dilute them. As a way forward, I suggest that
the OP, who clearly wants to publish material on the Web, learn LaTeX.
Should the idea of editing raw text seem daunting, I suggest LyX (lyx.org),
a front-end to LaTeX. Five minutes with LyX would help anyone realise
the difference and convey the idea, e.g. varying outputs, styles,
imposition of structure, etc.

Only a few days ago, somebody on the LyX mailing lists mentioned his
upcoming presentation on "Word: What you See Is What a Mess". The
presentation I am delivering on Wednesday is well-formed XHTML
http://schestowitz.com/Weblog/archiv...blic-speaking/ and is
powered by Eric Meyer's S5.

Roy

--
Roy S. Schestowitz | "Software sucks. Open Source sucks less."
http://Schestowitz.com | SuSE Linux | PGP-Key: 74572E8E
1:45pm up 17 days 12:13, 3 users, load average: 0.51, 0.58, 0.70