#1
Posted to microsoft.public.word.docmanagement
XML word Newbie
I'm familiar with MS Word. I have read that Word can work with XML. Is this true? If so: I have a one-page document that has line breaks, spaces, bullet points, etc. I was wondering whether this document can be converted into XML tags, i.e. have the appropriate XML tags added wherever there is a line break, bulleted list, etc. If MS Word doesn't do that, can someone explain what XML is doing in Word?
--
I believe in Hope.
DesigningSally
#2
Posted to microsoft.public.word.docmanagement
XML word Newbie
XML? Sounds easy, but XML is a general-purpose markup standard and a family of facilities that work with it. Of itself, XML does nothing, so it's important for you to know what you are trying to achieve.

Word 2003 and later (and Mac Word 2008) have a number of facilities for working with XML. In essence,
 a. you can save and open documents in the Word 2003 and Word 2007 WordML/WordProcessingML formats
 b. you can work with what Microsoft calls "Custom XML", in essence enabling you to define an XML schema and use Word as an editor for capturing data that conforms to that schema
 c. some other versions of Word on both Windows and Mac can open/save in some of those (a) formats using converters that you can download from Microsoft sites.

Roughly speaking, (a) and (b) have very little to do with each other, and from what you say, I'd say that (a) is likely to be more useful to you than (b).

Perhaps you are looking for some way to tag your Word text according to some existing XML standard/schema such as docbook (see http://www.docbook.org ).

The good news is that that should be possible, because in theory you can transform any piece of XML into any other (within reason) using XSL transforms. The bad news is that
 a. XSL usually has quite a steep learning curve (unless you just "get it")
 b. WordML has a lot of stuff in it that you may find particularly tricky to handle using XSL.

WordML has to be able to encode everything that Word can do, which means it has to encode a mass of style and formatting information, and it has to do it in a way that does not break XML syntax rules. So Word puts quite a lot of stuff into its XML files that does not correspond to anything you see on screen or even in the Word Object model. So the XML that represents what you may think of as a simple bulleted paragraph may actually be quite complex, and may also be in several parts of the document.

So getting what /you/ want out of WordML may be non-trivial. There may well be sites with lots of useful XSL code that will help: I don't know.
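To give a rough idea of what "a simple bulleted paragraph" looks like in WordML, here is a minimal sketch in Python (standard library only, used here instead of XSL purely for brevity). It assumes the Word 2007 WordprocessingML namespace and a hand-written, heavily cut-down sample; real Word output carries far more formatting markup. The output tag names (doc, para, item, br) are made up for illustration and do not belong to any standard schema.

```python
import xml.etree.ElementTree as ET

# Main namespace used by Word 2007 WordprocessingML.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hand-written, much simplified sample (an assumption, not real Word output).
SAMPLE = f"""<w:document xmlns:w="{W}">
  <w:body>
    <w:p>
      <w:r><w:t>First line</w:t><w:br/><w:t>second line</w:t></w:r>
    </w:p>
    <w:p>
      <w:pPr><w:numPr><w:ilvl w:val="0"/><w:numId w:val="1"/></w:numPr></w:pPr>
      <w:r><w:t>A bullet point</w:t></w:r>
    </w:p>
  </w:body>
</w:document>"""


def simplify(wordml: str) -> ET.Element:
    """Map WordML paragraphs to hypothetical <para>/<item> elements."""
    root = ET.fromstring(wordml)
    out = ET.Element("doc")
    for p in root.iterfind(".//w:p", NS):
        # A paragraph that belongs to a list carries w:numPr in its properties.
        is_list = p.find("w:pPr/w:numPr", NS) is not None
        el = ET.SubElement(out, "item" if is_list else "para")
        last_br = None
        for node in p.iterfind("w:r/*", NS):
            if node.tag == f"{{{W}}}t":
                text = node.text or ""
                if last_br is None:
                    el.text = (el.text or "") + text
                else:
                    last_br.tail = (last_br.tail or "") + text
            elif node.tag == f"{{{W}}}br":
                # A manual line break inside the paragraph.
                last_br = ET.SubElement(el, "br")
    return out


print(ET.tostring(simplify(SAMPLE), encoding="unicode"))
```

Note that w:numPr only says the paragraph is in a list. Whether that list is bulleted or numbered is defined elsewhere (via the w:numId reference into the numbering definitions), which is one example of the information being spread over several parts.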
But if that's what you want to do, it's worth bearing in mind that Word (2003/2007 anyway) can save in several different XML formats:
 a. .docx/.docm. These are actually .zip format files with a number of WordML .xml files zipped up inside them, so before you can do any XSL processing on them you have to locate and extract the relevant .xml
 b. .xml. The Word 2007 format .xml files actually contain most, but not all, of the same WordML stuff that .docx files contain, but in a single file format that you could in theory feed straight to an XSL processor
 c. .odt files. These use the OpenDocument XML standard that is also used by OpenOffice, not WordML. Word 2007 SP2 can save in this format natively. Off the top of my head, I don't know whether the external converter package that other versions of Word use can do it. These files are also .zip files with .xml inside, and they do not necessarily encode everything that .docx can encode, but they may actually be more useful as a way of getting what you want, because
    - they are simpler
    - I suspect there are more tools out there capable of transforming .odt

Peter Jamieson
http://tips.pjmsn.me.uk
Visit Londinium at http://www.ralphwatson.tv
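As an illustration of the "locate and extract the relevant .xml" step for .docx files, here is a minimal Python sketch using only the standard library. It assumes the main document part is stored as word/document.xml, which is the usual location but is strictly determined by the package's relationship files. To keep the example self-contained it builds a tiny stand-in package in memory; the helper names and the stand-in content are hypothetical.

```python
import io
import zipfile

MAIN_PART = "word/document.xml"  # usual location of the main part (assumption)


def list_xml_parts(docx_bytes: bytes) -> list[str]:
    """Return the names of all .xml entries inside the zip package."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as z:
        return [name for name in z.namelist() if name.endswith(".xml")]


def read_main_part(docx_bytes: bytes) -> str:
    """Extract the main document XML as text, ready for further processing."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as z:
        return z.read(MAIN_PART).decode("utf-8")


def make_fake_package() -> bytes:
    """Build a tiny stand-in for a .docx file (not a valid Word document)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as z:
        z.writestr(MAIN_PART, "<w:document>placeholder</w:document>")
        z.writestr("word/numbering.xml", "<w:numbering/>")
        z.writestr("docProps/thumbnail.jpeg", b"")
    return buf.getvalue()


package = make_fake_package()
print(list_xml_parts(package))
print(read_main_part(package))
```

With a real file you would read its bytes first, e.g. with open("mydoc.docx", "rb"), and pass them to the same functions.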