#1
ms-word creates html dash problem for front page
greetings from canada!!
i am using ms-word to transform/create a word-mode file into an html-mode file (a manual of about 350+ pages)..... (using the "save as .html" function).

background to the application: the html file has been run through the html filter to eliminate the ms-word superfluous coding that would normally enable "full circle" editing using ms-word.... we do not need that ms-word coding as the html file will not be going back to ms-word - and although there may be some minor editing by ms-frontpage, the file will then be passed onward to a custom program that automatically generates thousands of hyperlinks as well as a permuted cross-reference from the content itself....

my problem: regardless of whether the html filter is used or not, the html output from ms-word contains a small flaw: on some blank lines throughout the html-mode file, there is a "_" (i.e. looks like an "underscore" character) in column 1 --

my request for assistance: i would like to know how to completely prevent this character from ever occurring so that we might alter our word processing text entry rules to avoid its presence in the html-mode file....

any suggestions will be very much appreciated....

jack bonney
#2
----- Original Message -----
From: "jack w. bonney vancouver"
Newsgroups: microsoft.public.word.docmanagement
Sent: Thursday, July 07, 2005 11:01 PM
Subject: ms-word creates html dash problem for front page

Hello Jack, Ever hear of Patterson Park ;-)))

Your first three paragraphs are in terrible conflict. It's impossible to create an html page from Word that does not include both some invalid and deprecated html. (A simple example is in bolding fonts: <b></b>.) You haven't specified which version of Word you're using, which may assist the MVPs. I'm assuming that you started from scratch and created your document in Word with both font and page layout formatting? Perhaps even some other Word goodies added in?

My suggestion is to take your entire document of approx 350 pages and save it as a STRAIGHT-TEXT file with not a solitary piece of formatting or layout included and create your html pages from that text. (It's the only way to avoid such simple errors as you currently face.) The older versions of FrontPage were no better when using fonts or components than Word. FP also used deprecated html. MS has said that 2003 FP (Standalone) would be better, however I've heard from others that's not so.

In your second last paragraph you provide the following: " there is a '_' (i.e. looks like an 'underscore' character) in column 1 -- " There is conflict here as well. Is it an underscore or a double-hyphen, and is it surrounded by quotes or have you just used the quotes for emphasis? I'm inclined to believe that these mystery characters are weaknesses in the html cleaner. Nor do I believe you'll find a setting in Word which will control the use of these characters (at least as related to the html component). If the quotes are there, a search and replace with any text editor would be very easy to do. If the underscore is the mystery character, then it's possibly related to the image links that Word creates.

Before composing this reply to you, I used Word to create only my second-ever html page. I have numerous settings in Word (as related to html) turned off as I dug for solutions for others. I copied a complex CSS-HTML page into Word and then saved as html (with the aforementioned settings). I do not have the html cleaner for Word2000 installed as I never intend to use Word to create web pages. The resulting html, when viewed afterward, was a far cry from being anywhere in the same continent as clean html.
#3
Hi Jack,
To add to the reply from lostinspace: without knowing the content of the Word .doc file, seeing a snippet of the HTML that shows the 'mystery' character, and knowing the version of Word you're using, it's not easy to know where to focus attention.

If you're using the Office 2000 HTML filter, you'll usually find that the standalone MSFilter.exe tool (also under Start=>Programs=>Microsoft Office Tools) gives you more options for stripping out the information than File=>Export to HTML from inside of Word. This article has additional information on what the filter can remove: http://office.microsoft.com/en-us/as...549981033.aspx

-- Let us know if this helped you, Bob Buckland ?:-) MS Office System Products MVP *Courtesy is not expensive and can pay big dividends* For Everyday MS Office tips to "use right away" - http://microsoft.com/events/series/a...andtricks.mspx
#4
lostinspace shared this with us in microsoft.public.word.docmanagement:
> My suggestion is to take your entire document of approx 350 pages and save it as a STRAIGHT-TEXT file with not a solitary piece of formatting or layout included and create your html pages from that text. (It's the only way to avoid such simple errors as you currently face.)

This also works:
* Open your document in Word.
* Copy everything to the clipboard: CTRL+A CTRL+C
* Open Dreamweaver (I don't know Frontpage, never used it)
* Paste from clipboard: CTRL+V

This method preserves most of the formatting AND gives you better HTML. Dreamweaver 2004 is even supposed to be HTML 4.01 and XHTML compliant.

-- Amedee Van Gasse
#5
amadee - thank you for your fast response to my question.... patterson park is the trotting track in ladner, right??

unfortunately, at the moment, it is not a perfect world out there!! the original document follows a typical pattern of evolution: it was created by collaboration between 4 amateurs using word 3.1 (1995) with a little bit of wordperfect thrown in for good measure.... in other words, it was a dog's breakfast!! everything was converted/consolidated under word 97 and subsequently migrated up to its current platform of windows xp and ms-word 2000....

anyway, all the word processing must be preserved because there are several hundred copies in the field, and ms-word is used to maintain the paper-based version.... (and some management egos are involved, not to mention budget).... at the same time, we want to transform the wordprocessed version to electronic display.....

the "mystery character" looks like the underscore character but is actually a bit shorter, and when viewed with frontpage on the split-screen, each character represents a "chunk" of html code, but there is no underscore character in the code string..... therefore, i can not "search and replace"....

the object of the exercise is to be able to transform a legacy document over to electronic display as automatically as possible, with only the bare minimum of cosmetic "touch-up" with front page..... i.e. keeping the human intervention to an absolute minimum....

i couldn't provide the output image as it would not copy to this panel.... however, here is the html code for three lines with the mystery character that displays in frontpage and ms-explorer: when i highlight the character as it displays in the design panel, only the string is highlighted accordingly in the code panel..... but in fact, nothing is supposed to display because it is intended to be a blank line....

<p class=MsoNormal align=center style='text-align:center'><u><span style='font-size:28.0pt;'>&nbsp;</span></u></p>
<p class=MsoNormal align=center style='text-align:center'><u><span style='font-size:28.0pt;'>&nbsp;</span></u></p>
<p class=MsoNormal align=center style='text-align:center'><u><span style='font-size:28.0pt;'>&nbsp;</span></u></p>

the result is an underscore displayed down the center of the page..... the situation occurs intermittently throughout the document....

thank you in advance for your continued interest in this issue and for any further advice you may be able to offer....

jack.
#6
> This also works:
> * Open your document in Word.
> * Copy everything to the clipboard: CTRL+A CTRL+C
> * Open Dreamweaver (I don't know Frontpage, never used it)
> * Paste from clipboard: CTRL+V
> This method preserves most of the formatting AND gives you better HTML. Dreamweaver 2004 is even supposed to be HTML 4.01 and XHTML compliant.

I've been told this method also works for FrontPage. However, pasting 10 pages into DW caused a noticeable pause for me before the content showed up, so the OP might want to do this in chunks.
#7
----- Original Message -----
From: "jack w. bonney vancouver"
Newsgroups: microsoft.public.word.docmanagement
Sent: Friday, July 08, 2005 10:20 AM
Subject: ms-word creates html dash problem for front page & ms-explorer

Jack, Aye! Patterson Park, a trotting park. I'm not sure of the then or today location. (1960 reference) Some seem to believe that Patterson is today's Sandown. At one time I was attempting to trace a trotting circuit which began in the spring in locations throughout NorthWest Canada and ended in the Fall in the Wash, Oregon and California (NorthWest US.) I didn't have any success as the documentation for these things is rather skimpy.

Given your method of creation for the doc, and with faces to save, your options are going to be very limited. I don't believe either Word or the html cleaner will help. Nor do I believe you'll find an automated method of removing mystery characters.

I downloaded the Standalone html cleaner that Bob Buckland provided a link to and ran it on the aforementioned page that I created. The result was that the cleaner removed most everything, however some items (such as absolute paragraph position) still remain. [Even though I had NOT any absolute paragraph positioning in the original web page.]

I copied and pasted the three html lines that you provided into the html option of FrontPage and was left with a blank web page. No mystery characters. (Leads me to believe that something exclusive to your end [server or OS] is the cause, although you did previously add that the mystery characters were not appearing consistently.)

Additionally, most everything contained in the three lines of html that you provide is either invalid or deprecated html. Those three lines should read: <p></p> <p></p> <p></p> And no more. Anything in excess is bloat caused by either Word, the html cleaner or FrontPage.

It's still my opinion that the most effective and most efficient method is for you to start from scratch with basic text and design your layout with CSS/html. The alternative for server side is PHP/MySQL. Web pages created by Word only make your already complicated situation more complicated. In the long run, the lack of a solution today will give you far more headaches in the future.

As far as Dreamweaver? Many folks say that it's very useful software, while others say that it compares in many ways to FP in creating invalid HTML. However, using DW as an html cleaner (as was suggested) may be an option to explore, although not worth the price of purchase.

Unless Bob Buckland or one of the others is able to provide additional insight, I don't see many options for you that will provide what you desire to accomplish.
#8
to all who have contributed to addressing my problem regarding the mystery underline character appearing as a result of the transform from .doc to .html using ms-word....

well, my technical support team has a motto: "we can do anything".... and while every now and again, i do have reservations on the accuracy of that motto, they have never let me down..... so, given all the information that you folks contributed, we have managed to resolve the issue by simply programming the darn stuff out of the file.....

in case you are interested in having a piece of the problem for analysis, i have copied it to this panel:

<p class=MsoNormal align=center style='text-align:center'><u><span style='font-size:28.0pt;mso-bidi-font-size:10.0pt'><![if !supportEmptyParas]>&nbsp;<![endif]><o:p></o:p></span></u></p>

the precise "bug" starts at &nbsp; and extends to </p>

although i am puzzled as to how it gets there, i can imagine it has to do with the metamorphoses that the file has gone thru....

thanks again for your help - we appear to be back on the tracks....

best regards, jack bonney.
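Jack did not post the program his team wrote, so the following is only a minimal sketch of what "programming it out of the file" could look like. It assumes the offending paragraphs always have the shape shown in the snippets in this thread (a <p> holding nothing but an underlined <span> with a non-breaking space, optionally wrapped in Word's supportEmptyParas conditional). The function name and the replacement string are hypothetical.

```python
import re

# Assumed shape: <p ...><u><span ...> only filler </span></u></p>, where the
# filler is whitespace, &nbsp;, Word's <![if !supportEmptyParas]> /
# <![endif]> markers, or an empty <o:p></o:p> pair.
EMPTY_UNDERLINED_PARA = re.compile(
    r"<p\b[^>]*>\s*<u>\s*<span\b[^>]*>"
    r"(?:\s|&nbsp;|<!\[if !supportEmptyParas\]>|<!\[endif\]>|<o:p>\s*</o:p>)*"
    r"</span>\s*</u>\s*</p>",
    re.IGNORECASE,
)


def strip_empty_underlined_paragraphs(html_text, replacement="<p>&nbsp;</p>"):
    """Swap each empty underlined paragraph for a plain blank paragraph."""
    return EMPTY_UNDERLINED_PARA.sub(replacement, html_text)


if __name__ == "__main__":
    sample = (
        "<p class=MsoNormal align=center style='text-align:center'><u>"
        "<span style='font-size:28.0pt;mso-bidi-font-size:10.0pt'>"
        "<![if !supportEmptyParas]>&nbsp;<![endif]><o:p></o:p></span></u></p>"
    )
    print(strip_empty_underlined_paragraphs(sample))
```

A paragraph whose underlined span contains real text (a title, say) is left alone, because the pattern only tolerates filler between the span tags. A regular expression is enough here only because Word's output is so uniform; a real HTML parser would be the safer choice for anything less predictable.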
#9
> the precise "bug" starts at &nbsp; and extends to </p>

This character (&nbsp;) is the escape code for a space and is inserted by FrontPage in ALL open-space paragraphs. In FP, if you hit return and insert a line break (to the next paragraph), FP will insert an entire row of these characters into the html; I've never counted how many when deleting them.
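For anyone wondering why a search for "_" finds nothing: the entity decodes to a non-breaking space (U+00A0), and the visible mark is just the underline drawn beneath it. A small sketch using Python's standard html module shows the decoding; the helper name count_nbsp is made up for illustration.

```python
import html

NBSP = "\u00a0"  # the character that &nbsp; (or &#160;) decodes to


def count_nbsp(html_text):
    """Count non-breaking spaces, whether written as entities or raw."""
    return html.unescape(html_text).count(NBSP)


if __name__ == "__main__":
    decoded = html.unescape("&nbsp;")
    print(decoded == NBSP)   # the entity is a non-breaking space
    print(decoded == "_")    # ... and not an underscore
    print(count_nbsp("<p>&nbsp;&nbsp;</p>"))
```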
#10
lostinspace shared this with us in microsoft.public.word.docmanagement:
> amadee - thank you for your fast response to my question.... patterson park is the trotting track in ladner, right??

I don't know? I'm from .be.

> unfortunately, at the moment, it is not a perfect world out there!! the original document follows a typical pattern of evolution: it was created by collaboration between 4 amateurs using word 3.1 (1995) with a little bit of wordperfect thrown in for good measure.... in other words, it was a dog's breakfast!!

Ouch!

> everything was converted/consolidated under word 97 and subsequently migrated up to its current platform of windows xp and ms-word 2000.... anyway, all the word processing must be preserved because there are several hundred copies in the field, and ms-word is used to maintain the paper-based version.... (and some management egos are involved, not to mention budget)....

Ouch! PHB-alert!!!

> the "mystery character" looks like the underscore character but is actually a bit shorter, and when viewed with frontpage on the split-screen, each character represents a "chunk" of html code, but there is no underscore character in the code string..... therefore, i can not "search and replace"....

The underscore part is <u> </u>, which is BAD html: it is deprecated in HTML 4.0. The recommended alternative is a style (css): text-decoration: underline;

> the object of the exercise is to be able to transform a legacy document over to electronic display as automatically as possible, with only the bare minimum of cosmetic "touch-up" with front page..... i.e. keeping the human intervention to an absolute minimum....

There exists html cleanup software that works in batch. GIYF.

> i couldn't provide the output image as it would not copy to this panel....

Put it on public webspace and provide a link.

> however, here is the html code for three lines with the mystery character that displays in frontpage and ms-explorer: when i highlight the character as it displays in the design panel, only the string is highlighted accordingly in the code panel..... but in fact, nothing is supposed to display because it is intended to be a blank line....
> <p class=MsoNormal align=center style='text-align:center'><u><span style='font-size:28.0pt;'>&nbsp;</span></u></p>
> <p class=MsoNormal align=center style='text-align:center'><u><span style='font-size:28.0pt;'>&nbsp;</span></u></p>
> <p class=MsoNormal align=center style='text-align:center'><u><span style='font-size:28.0pt;'>&nbsp;</span></u></p>

3 empty centered underlined paragraphs with a large font.

> the result is an underscore displayed down the center of the page..... the situation occurs intermittently throughout the document....

Let me guess: near titles, or where titles used to be?

> Given your method of creation for the doc, and with faces to save, your options are going to be very limited. I don't believe either Word or the html cleaner will help. Nor do I believe you'll find an automated method of removing mystery characters. I downloaded the Standalone html cleaner that Bob Buckland provided a link to and ran it on the aforementioned page that I created. The result was that the cleaner removed most everything, however some items (such as absolute paragraph position) still remain. [Even though I had NOT any absolute paragraph positioning in the original web page.]

What html cleaner was that? I use Absolute Html Compressor, and I'm investigating Linux/CygWin based code cleaners.

> Additionally, most everything contained in the three lines of html that you provide is either invalid or deprecated html. Those three lines should read: <p></p> <p></p> <p></p> And no more. Anything in excess is bloat caused by either Word, the html cleaner or FrontPage.

I totally agree.

> It's still my opinion that the most effective and most efficient method is for you to start from scratch with basic text and design your layout with CSS/html. The alternative for server side is PHP/MySQL.

I agree. See above for a CSS example. Read the html/css specs for more info ;-)

> Web pages created by Word only make your already complicated situation more complicated. In the long run, the lack of a solution today will give you far more headaches in the future.

In other words: invest now to avoid hidden costs in the future.

> As far as Dreamweaver? Many folks say that it's very useful software, while others say that it compares in many ways to FP in creating invalid HTML.

That *seriously* depends on the version *and* some configuration options. DW 2004 for example creates almost 100% compliant HTML out of the box. But unfortunately a lot of people tinker with the configuration to make it work like previous versions, or it's the result of some really bad hand-hacking.

> However, using DW as an html cleaner (as was suggested) may be an option to explore, although not worth the price of purchase.

If you already happen to have DreamWeaver (don't bother to buy it if you don't have it) it is indeed a good html cleaner. You can tell it to clean Word html, and it does a rather good job. In this case, it totally deletes your 3 underlined centered empty paragraphs.

> Unless Bob Buckland or one of the others is able to provide additional insight, I don't see many options for you that will provide what you desire to accomplish.

I have not tried Nvu yet, but it promises to be a DreamWeaver clone. There is no harm trying it, because it is Free(libre) and therefore free(gratis). www.nvu.com

-- Amedee Van Gasse using XanaNews 1.17.5.7 If it has an "X" in the name, it must be Linux?
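For a batch job, the CSS suggestion made earlier in this post (replace the deprecated underline tag with text-decoration: underline) could be applied mechanically. The following is only a rough sketch, assuming the <u> tags in the Word output are well formed and properly paired; the function name u_to_css is hypothetical and this is not what any of the cleaners mentioned in the thread actually do.

```python
import re

UNDERLINE_SPAN = '<span style="text-decoration: underline;">'


def u_to_css(html_text):
    """Rewrite <u>...</u> as a span carrying text-decoration: underline."""
    # \b keeps the pattern from matching other tags that start with "u",
    # such as <ul>.
    opened = re.sub(r"<u\b[^>]*>", UNDERLINE_SPAN, html_text, flags=re.IGNORECASE)
    return re.sub(r"</u\s*>", "</span>", opened, flags=re.IGNORECASE)


if __name__ == "__main__":
    print(u_to_css("<p><u>Chapter 1</u></p>"))
```

Note that this keeps the underline; it only moves it from markup to style. The empty underlined paragraphs would still show their short line unless they are removed separately.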
#11
Hi Jack,
As you mentioned, you need to maintain the Word version of the file. What your snippet shows is that there is a typed space that has underlining applied to it and that space is centered on the page - a guess would be that it was a visual paragraph divider at one time, as the font size is set to 28pts, about 1/4" in height.

The interaction between Word 2000 and FrontPage 2000 via the clipboard could produce some unexpected results, as FrontPage 2000 hadn't been fully updated to handle what Word 2000 was putting on the clipboard as 'HTML' format.

Within a spare *copy* of the Word document itself, see what happens if you use Edit=>Replace and there use the [More] choice, then [Format]=>Font, and set it to look for underlined text and 28 point font size. In the 'Find what' box type a space then ^p and in 'Replace with', just to give you a visual check, type XXXX, then do a Replace All. If that 'finds' the text (i.e. replaces it with XXXX) you can click the Undo button on the Word toolbar, leave the 'Replace with' box blank (replace with nothing), and run the Replace All again.

=========
"jack w. bonney vancouver" wrote in message ...

amadee - thank you for your fast response to my question.... patterson park is the trotting track in ladner, right??

unfortunately, at the moment, it is not a perfect world out there!! the original document follows a typical pattern of evolution: it was created by collaboration between 4 amateurs using word 3.1 (1995) with a little bit of wordperfect thrown in for good measure.... in other words, it was a dog's breakfast!! everything was converted/consolidated under word 97 and subsequently migrated up to its current platform of windows xp and ms-word 2000.... anyway, all the word processing must be preserved because there are several hundreds of copies in the field, and ms-word is used to maintain the paper-based version.... (and some management egos are involved, not to mention budget)....
at the same time, we want to transform the wordprocessed version to electronic display..... the "mystery character" looks like the underscore character but is actually a bit shorter, and when viewed with frontpage on the split-screen, each character represents a "chunk" of html code, but there is no underscore character in the code string..... therefore, i can not "search and replace".... the object of the exercise is to be able to transform a legacy document over to electronic display as automatically as possible, with only the bare minimum of cosmetic "touch-up" with front page..... i.e. keeping the human intervention to an absolute minimum....

i couldn't provide the output image as it would not copy to this panel.... however, here is the html code for three lines with the mystery character that displays in frontpage and ms-explorer: when i highlight the character as it displays in the design panel, only the string is highlighted accordingly in the code panel..... but in fact, nothing is supposed to display because it is intended to be a blank line....

<p class=MsoNormal align=center style='text-align:center'><u><span style='font-size:28.0pt;'> </span></u></p>
<p class=MsoNormal align=center style='text-align:center'><u><span style='font-size:28.0pt;'> </span></u></p>
<p class=MsoNormal align=center style='text-align:center'><u><span style='font-size:28.0pt;'> </span></u></p>

the result is an underscore displayed down the center of the page..... the situation occurs intermittently throughout the document.... thank you in advance for your continued interest in this issue and for any further advice you may be able to offer.... jack

-- Let us know if this helped you, Bob Buckland ?:-) MS Office System Products MVP *Courtesy is not expensive and can pay big dividends* For Everyday MS Office tips to "use right away" - http://microsoft.com/events/series/a...andtricks.mspx
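Since the stated goal is to change the text entry rules rather than patch the output each time, it may help to list where in the saved HTML the underlined blanks occur, so the matching spots in the Word document can be checked. The following is a rough sketch under assumptions: it relies only on Python's standard html.parser module, it treats any u element whose entire text is whitespace (including non-breaking spaces) as a "mystery character", and the names UnderlinedBlankFinder and find_underlined_blanks are hypothetical.

```python
from html.parser import HTMLParser


class UnderlinedBlankFinder(HTMLParser):
    """Record the line number of every <u> element holding only whitespace."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # &nbsp; arrives as U+00A0
        self.depth = 0          # nesting depth of <u> elements
        self.text = []          # text collected inside the outermost <u>
        self.start_line = None  # line where the outermost <u> opened
        self.hits = []          # line numbers of blank underlined runs

    def handle_starttag(self, tag, attrs):
        if tag == "u":
            if self.depth == 0:
                self.text = []
                self.start_line = self.getpos()[0]
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "u" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                content = "".join(self.text)
                # Non-empty but all whitespace: an underlined blank.
                if content and not content.strip():
                    self.hits.append(self.start_line)

    def handle_data(self, data):
        if self.depth > 0:
            self.text.append(data)


def find_underlined_blanks(html: str) -> list:
    """Return 1-based line numbers where underlined whitespace starts."""
    finder = UnderlinedBlankFinder()
    finder.feed(html)
    finder.close()
    return finder.hits
```

Looking at the text just before each reported line should show whether the blanks cluster near titles or former paragraph dividers, which is the pattern guessed at earlier in the thread.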