#1
Posted to microsoft.public.word.docmanagement,microsoft.public.word.programming
I'm looking for an Open Packaging Conventions (OPC) package creator/manager for OS X.

Specifically, I'm trying to generate dynamic .docx files, which is going pretty well, but I'm not able to repackage the edited files without blowing up Word... any .rels, image, or XML file in the package that I change (other than document.xml itself) is seen as corrupt by Word when it tries to open the file. Sure enough, Word is clever enough to recover the remaining elements, but anything I change gets nuked. That prevents me from swapping in new images (charts in my case) or modifying any hyperlinks (which live in the .rels files).

I found that I can open a .docx file directly in StuffIt Archive Manager (SAM), without even having to change the extension, which eliminates the need to re-zip the .docx from scratch (which seems to blow up my entire document when I try). Using SAM, I can extract only the file I choose, edit it, put it back, and the other elements remain intact.

The key problem appears to be the compression technique: the .docx isn't actually a Plain Old Zip (POZ) file, it's an Open Packaging Conventions (OPC) file: http://en.wikipedia.org/wiki/Open_Packaging_Conventions

Now, if I could only find an OPC creator/manager for the Mac... a GUI would be great, but a command line would do as well. I found what at first appeared to be such a tool (Porticus), but it doesn't seem to do anything except manage MacPorts: http://www.versiontracker.com/dyn/moreinfo/macosx/32608

FYI, Porticus needs MacPorts installed as well: http://www.macports.org/install.php

However, I can't see how Porticus helps me with OPC file management... I'm probably on a wild turkey chase with that, but there might be something there.

I'm guessing that there are Mac developers here more informed than me, and I'm hoping someone can shed light on my OPC requirements.

Thanks in advance, folks.
#2
"Phantom" wrote:

> I'm not able to repackage the edited files without blowing up Word... any rel(s), image, or XML file in the package that I change (other than the document.xml itself) is seen as corrupt by Word when it tries to open the file.

What happens if you open one of the XML files inside the container and then save it without changing anything? Does that corrupt the document as well? If so, extract the XML file from both the correct and the corrupt version and do a byte-by-byte comparison. It could be that the editor you use adds a byte order mark (BOM) at the start of an XML file.

> The key problem appears to be the compression technique: the docx isn't actually a Plain Old Zip (POZ) file, it's actually an Open Packaging Convention (OPC) file

From a compression point of view, there isn't any difference between the two. POZ and OPC are the same format. The difference is that a POZ file has no notion of its contents, while an OPC file uses standardized entry points to find the relations between the different files in the container.

> Now, if I could only find an OPC creator/manager for the Mac... a GUI would be great, but a command line would do as well.

Mono implements the System.IO.Packaging namespace for .NET on Mac OS (not sure if it is in the latest release already). That namespace contains an API for accessing and manipulating OPC files. I know the API works, as I have already written apps which use it (I'm not a Mac specialist, I just needed to make some software cross-platform). Of course, this means you would still have to program your application yourself.

Yves
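The byte-level check suggested above can be sketched in Python with only the standard library. This is a rough illustration, not something posted in the thread: the function names `parts_with_bom` and `same_part_bytes` are made up for the example, and it only looks for a UTF-8 BOM, on the assumption that the parts are UTF-8 encoded.

```python
import zipfile

# The three bytes some editors prepend to UTF-8 files.
UTF8_BOM = b"\xef\xbb\xbf"


def parts_with_bom(docx_path):
    """Return the names of .xml/.rels parts whose bytes start with a UTF-8 BOM.

    Hypothetical helper for illustration; it reads the package as a plain zip.
    """
    flagged = []
    with zipfile.ZipFile(docx_path) as package:
        for name in package.namelist():
            if name.endswith((".xml", ".rels")) and package.read(name).startswith(UTF8_BOM):
                flagged.append(name)
    return flagged


def same_part_bytes(good_path, bad_path, part_name):
    """Byte-by-byte comparison of one part taken from two packages."""
    with zipfile.ZipFile(good_path) as good, zipfile.ZipFile(bad_path) as bad:
        return good.read(part_name) == bad.read(part_name)
```

Running `parts_with_bom` on the file Word rejects and `same_part_bytes` on a part from the working and the rejected copy would show whether the editor changed anything beyond the intended edit.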
#3
On 2009-10-11 12:26:01 -0700, "Yves Dhondt" said:

> What happens if you open one of the XML files inside the container and then save it without changing anything? Does that corrupt the document as well?

No, only when something is added to the file (without its proper meta information, I'm guessing).

> Mono implements the System.IO.Packaging namespace for .Net on MacOS [...] Of course, this means you would still have to program your application yourself.

.NET isn't going to help me much on OS X... this is supposed to be an open standard, so it should go without saying that I shouldn't have to use an MS product to manage the document. To Microsoft's credit, the document is pretty well formed, and they're 95% of the way there... I just can't create the dang OPC file.
#4
"Phantom" wrote:

> .net isn't going to help me much on OS X... this is supposed to be an open standard, so it should go without saying that I shouldn't have to use a MS product to manage the document.

Mono (http://www.mono-project.com/Main_Page) has nothing to do with MS. It's a cross-platform, open-source implementation of the .NET framework. There is also a Java implementation called OpenXML4J (http://sourceforge.net/projects/openxml4j/), but I have no experience with it.

You were writing about changing hyperlinks. Do your documents crash if the only thing you do is change the value of the "Target" attribute of your hyperlink's relationship in your document.xml.rels file?

Yves
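For that Target-only experiment, a minimal Python sketch could copy the package entry by entry and rewrite just the hyperlink relationship. The function name `retarget_hyperlinks` and the part path constant are assumptions for the example; the namespace and relationship-type URIs are the standard OPC/OOXML ones. Note that re-serializing with ElementTree changes the XML declaration and attribute quoting, and whether Word tolerates that has not been verified here.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
HYPERLINK_TYPE = (
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink"
)
# Assumed location of the main document's relationships part.
RELS_PART = "word/_rels/document.xml.rels"


def retarget_hyperlinks(src_docx, dst_docx, old_url, new_url):
    """Copy src_docx to dst_docx, swapping old_url for new_url in hyperlink
    relationships. Returns the number of relationships changed."""
    # Keep the relationships namespace as the default one when writing.
    ET.register_namespace("", REL_NS)
    changed = 0
    with zipfile.ZipFile(src_docx) as src, \
            zipfile.ZipFile(dst_docx, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            if info.is_dir():
                continue  # directory entries are not parts of the package
            data = src.read(info.filename)
            if info.filename == RELS_PART:
                root = ET.fromstring(data)
                for rel in root.findall(f"{{{REL_NS}}}Relationship"):
                    if rel.get("Type") == HYPERLINK_TYPE and rel.get("Target") == old_url:
                        rel.set("Target", new_url)
                        changed += 1
                buf = io.BytesIO()
                ET.ElementTree(root).write(buf, encoding="UTF-8", xml_declaration=True)
                data = buf.getvalue()
            # Entry names are reused unchanged, so the package layout is preserved.
            dst.writestr(info.filename, data)
    return changed
```

Because every entry keeps its original name, the result should have the same internal layout as the source package, which isolates the Target change from any repackaging problem.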
#5
On 2009-10-11 12:26:01 -0700, "Yves Dhondt" said:

> From a compression point of view, there isn't any difference between the two. POZ and OPC are the same format. [...]

You're definitely ahead of me on ideas, Yves, thanks for your input so far.

I tried out another idea regarding resource forks, but no go. I expanded the .docx file into its juicy component files, then (without changing anything) recompressed them with the command-line zip tool, which by all accounts does not include resource forks:

    zip -X -r test test

... then renamed test.zip to test.docx and attempted to open it with Word 2008. No luck, Word declares the document bogus. I did attempt a zip -df, but that's long deprecated and doesn't work. Given that the current command-line zip tool doesn't stuff resource forks in the archive in the first place, it shouldn't be an issue.

Just to make sure, I checked the zip file and didn't see any resource-looking files:

    % zip -X -r test test
      adding: test/ (stored 0%)
      adding: test/[Content_Types].xml (deflated 84%)
      adding: test/_rels/ (stored 0%)
      adding: test/_rels/.rels (deflated 66%)
      adding: test/docProps/ (stored 0%)
      adding: test/docProps/app.xml (deflated 73%)
      adding: test/docProps/core.xml (deflated 52%)
      adding: test/docProps/custom.xml (deflated 60%)
      adding: test/word/ (stored 0%)
      adding: test/word/_rels/ (stored 0%)
      adding: test/word/_rels/document.xml.rels (deflated 85%)
      adding: test/word/_rels/header2.xml.rels (deflated 38%)
      adding: test/word/_rels/header3.xml.rels (deflated 38%)
      adding: test/word/_rels/header4.xml.rels (deflated 38%)
      adding: test/word/document.xml (deflated 83%)
      adding: test/word/endnotes.xml (deflated 65%)
      adding: test/word/fontTable.xml (deflated 85%)
      adding: test/word/footer1.xml (deflated 65%)
      adding: test/word/footer2.xml (deflated 79%)
      adding: test/word/footer3.xml (deflated 81%)
      adding: test/word/footer4.xml (deflated 81%)
      adding: test/word/footnotes.xml (deflated 65%)
      adding: test/word/header1.xml (deflated 70%)
      adding: test/word/header2.xml (deflated 64%)
      adding: test/word/header3.xml (deflated 64%)
      adding: test/word/header4.xml (deflated 64%)
      adding: test/word/media/ (stored 0%)
      adding: test/word/media/image1.jpeg (deflated 72%)
      adding: test/word/media/image2.jpeg (deflated 61%)
      adding: test/word/numbering.xml (deflated 96%)
      adding: test/word/settings.xml (deflated 59%)
      adding: test/word/styles.xml (deflated 89%)
      adding: test/word/theme/ (stored 0%)
      adding: test/word/theme/theme1.xml (deflated 79%)
      adding: test/word/webSettings.xml (deflated 34%)

I also checked your idea about files being changed in the round trip. I extracted one file (A), put it right back in, then took it out again (B). I then used diff to compare A and B and found them to be identical.

So it definitely seems like the metadata is missing from the OPC. You mentioned "OPC file uses standardized entrypoints to find the relation between the different files in the container"... did you mean that the OPC file has entry-point data that it tracks internally somewhere within the file, or that the entry points are simply standard file names and folder structures that differentiate it from a POZ file?

So close, so close...
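The "no resource-looking files" check can also be done programmatically. The sketch below is an illustration under stated assumptions: the helper name `mac_metadata_entries` is invented, and the patterns it looks for (`__MACOSX/` folders, `._` AppleDouble files, `.DS_Store`) are the usual macOS metadata residue rather than anything reported in this thread.

```python
import zipfile


def mac_metadata_entries(archive_path):
    """Return entry names that look like macOS resource-fork or Finder residue.

    Hypothetical helper: it only inspects names, not entry contents.
    """
    with zipfile.ZipFile(archive_path) as archive:
        suspicious = []
        for name in archive.namelist():
            base = name.rsplit("/", 1)[-1]  # last path component
            if name.startswith("__MACOSX/") or base.startswith("._") or base == ".DS_Store":
                suspicious.append(name)
        return suspicious
```

An empty list for the re-zipped file would support the conclusion above that resource forks are not what Word is objecting to.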
#6
"Phantom" wrote:

> did you mean that the internal OPC file has entrypoint data that it is tracking internally somewhere within the file, or that the entrypoints are simply standard file names and folder structures that differentiates it from a POZ file?

It looks for [Content_Types].xml in the root directory of your zip file. That file lists the content types of the different parts in your document. Without that file, Word (or any OPC-capable tool) can't do a thing.

When you extract your document, you extract it to a subfolder you create somewhere. When you recompress your document, you should compress the CONTENT of that subfolder, not the subfolder itself. The output of your zip command shows that you compressed the folder "test" (every entry starts with "test/"), so [Content_Types].xml ends up at test/[Content_Types].xml instead of in the root of the archive. So you should move into the folder and run your compression command from there.

Yves
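The advice above (zip the contents of the folder, not the folder) can be sketched in Python as follows. The function names `pack_folder_contents` and `has_root_content_types` are hypothetical, and the sketch assumes the output file is written outside the folder being packed.

```python
import os
import zipfile


def pack_folder_contents(folder, docx_path):
    """Zip the CONTENTS of `folder` so entries have no leading folder name.

    Assumes docx_path lies outside `folder`; otherwise the archive would
    try to include itself.
    """
    with zipfile.ZipFile(docx_path, "w", zipfile.ZIP_DEFLATED) as package:
        for dirpath, _dirnames, filenames in os.walk(folder):
            for filename in filenames:
                full_path = os.path.join(dirpath, filename)
                # Entry name relative to the folder, always with forward slashes.
                arcname = os.path.relpath(full_path, folder).replace(os.sep, "/")
                package.write(full_path, arcname)


def has_root_content_types(docx_path):
    """True if [Content_Types].xml sits at the root of the archive."""
    with zipfile.ZipFile(docx_path) as package:
        return "[Content_Types].xml" in package.namelist()
```

With the folder from the listing above, `pack_folder_contents("test", "test.docx")` followed by `has_root_content_types("test.docx")` should report that the entry point is where an OPC reader looks for it. From the shell, the same idea would presumably be `cd test` followed by `zip -X -r ../test.docx .`, which is what "run your compression command from there" amounts to.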