#1
Posted to microsoft.public.word.docmanagement
.docx files have XML components, but what's their use?
I have read that if a Word 2003 (.doc) file becomes corrupted, the chances of recovering it are slim, whereas a Word 2007 file can be recovered almost completely, because the actual file is in ZIP format and contains many XML files. But the .docx "file" as such is still a single file (until it is unzipped and extracted). So how can this format save the file when corruption occurs? Even with a ZIP file, if a small chunk is gone, you can never open it. Could anyone shed some light on this? Thanks
#2
Posted to microsoft.public.word.docmanagement
There could well be (and certainly are) cases where the corruption does not preclude the ZIP file from being opened.
--
Hope this helps. Please reply to the newsgroup unless you wish to avail yourself of my services on a paid consulting basis.
Doug Robbins - Word MVP, originally posted via msnews.microsoft.com
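As a rough illustration of the point that a damaged ZIP can often still be opened: the sketch below uses Python's standard zipfile module to report which member fails its CRC check and to read the paragraph text from word/document.xml regardless. The helper names first_bad_member and paragraphs are made up for this example, and it assumes the damage is in some other member (an image, say) and not in document.xml or the ZIP directory.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the w:p and w:t elements in document.xml
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def first_bad_member(docx_file):
    """Return the name of the first member that fails its CRC check, or None."""
    with zipfile.ZipFile(docx_file) as zf:
        return zf.testzip()


def paragraphs(docx_file):
    """Return the plain text of each paragraph in word/document.xml."""
    with zipfile.ZipFile(docx_file) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    return [
        "".join(t.text or "" for t in p.iter(W_NS + "t"))
        for p in root.iter(W_NS + "p")
    ]
```

Because each member is stored and checksummed separately, a bad image does not stop document.xml from being read; only the formatting, pictures or other parts held in the damaged member are lost.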
#4
Posted to microsoft.public.word.docmanagement
.docx and .doc files (at least since about Word 6) have a more similar structure than many people probably realise - even in .doc, which uses OLE Compound Files, the content is divided into different "streams" which can be opened separately.

That said, .docx does have considerable advantages, including

a. the ZIP file structure itself is a de facto standard - I don't personally have any ZIP utilities for recovering "unopenable" ZIP files, but I expect there are many. I don't think you will find as many utilities that know how to recover the content of a corrupted OLE Compound File.

b. each file within the ZIP is almost certainly going to be an XML text file such as "document.xml", or a single binary object such as a .jpg. If the ZIP is damaged, but you can still open it and get the document.xml, you have already achieved quite a lot. Even if the ZIP is damaged to the extent that you cannot open it, a recovery utility has a much better chance of identifying the component files when it knows that they are either XML or - in some cases at least - well-known types of binary object such as .jpg.

In contrast, in a .doc, the equivalent of document.xml is a complex binary structure. It isn't even a simple stream of text with markup. You have to have a utility that knows precisely how to look through that binary representation in order to extract anything at all. Although MS has now published the .doc standard (it appears to be a work in progress), I suspect not many people will want to spend resources developing new recovery software for obsolescent formats.

Peter Jamieson
http://tips.pjmsn.me.uk
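To make point b. a little more concrete, here is a minimal sketch of what a recovery utility might do when the ZIP cannot be opened at all (for example, because the central directory at the end of the file is missing). It scans the raw bytes for the "PK\x03\x04" local file header signature and tries to decode each member on its own. The function name salvage_members is hypothetical, only the "stored" and "deflate" methods are handled, and real recovery tools are considerably more careful than this.

```python
import struct
import zlib

# 30-byte ZIP local file header: signature, version, flags, method,
# mod time, mod date, CRC-32, compressed size, uncompressed size,
# file name length, extra field length
LOCAL_HEADER = struct.Struct("<4s5H3I2H")
SIGNATURE = b"PK\x03\x04"


def salvage_members(raw):
    """Return {name: data} for every member that can still be decoded,
    using only the local file headers and ignoring the central directory."""
    recovered = {}
    pos = raw.find(SIGNATURE)
    while pos != -1:
        header = raw[pos:pos + LOCAL_HEADER.size]
        if len(header) < LOCAL_HEADER.size:
            break
        (_, _, flags, method, _, _, crc,
         csize, _, nlen, xlen) = LOCAL_HEADER.unpack(header)
        name_start = pos + LOCAL_HEADER.size
        name = raw[name_start:name_start + nlen].decode("utf-8", "replace")
        data_start = name_start + nlen + xlen
        data = None
        try:
            if method == 8:  # deflate: the stream marks its own end
                d = zlib.decompressobj(-15)
                out = d.decompress(raw[data_start:])
                if d.eof:
                    data = out
            elif method == 0:  # stored: rely on the size in the header
                data = raw[data_start:data_start + csize]
        except zlib.error:
            data = None  # damaged or not really a member; skip it
        # Flag bit 3 means the CRC lives in a trailing data descriptor,
        # so the header value cannot be used as a check in that case.
        if data is not None and (flags & 0x08 or zlib.crc32(data) == crc):
            recovered[name] = data
        pos = raw.find(SIGNATURE, pos + 4)
    return recovered
```

Anything recovered under the name word/document.xml is plain XML text and can be inspected directly, which is exactly the advantage over the binary streams of a .doc.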