lostinspace
 

----- Original Message -----
From: "jack w. bonney vancouver"
Newsgroups: microsoft.public.word.docmanagement
Sent: Friday, July 08, 2005 10:20 AM
Subject: ms-word creates html dash problem for front page & ms-explorer


amadee - thank you for your fast response to my question.... patterson park
is the trotting track in ladner, right??

unfortunately, at the moment, it is not a perfect world out there!! the
original document follows a typical pattern of evolution: it was created by
collaboration between 4 amateurs using word 3.1 (1995) with a little bit of
wordperfect thrown in for good measure.... in other words, it was a dog's
breakfast!!

everything was converted/consolidated under word 97 and subsequently
migrated up to its current platform of windows xp and ms-word 2000....

anyway, all the word processing must be preserved because there are several
hundreds of copies in the field, and ms-word is used to maintain the
paper-based version.... (and some management egos are involved, not to
mention budget)....

at the same time, we want to transform the wordprocessed version to
electronic display.....

the "mystery character" looks like the underscore character but is actually
a bit shorter, and when viewed with frontpage on the split-screen, each
character represents a "chunk" of html code, but there is no underscore
character in the code string..... therefore, i cannot "search and
replace"....

the object of the exercise is to be able to transform a legacy document over
to electronic display as automatically as possible, with only the bare
minimum of cosmetic "touch-up" with front page..... i.e. keeping the human
intervention to an absolute minimum.... i couldn't provide the output image
as it would not copy to this panel....

however, here is the html code for three lines with the mystery character
that displays in frontpage and ms-explorer: when i highlight the character as
it displays in the design panel, only the string is highlighted
accordingly in the code panel..... but in fact, nothing is supposed to
display because it is intended to be a blank line....

<p class=MsoNormal align=center style='text-align:center'><u><span
style='font-size:28.0pt;'> </span></u></p>

<p class=MsoNormal align=center style='text-align:center'><u><span
style='font-size:28.0pt;'> </span></u></p>

<p class=MsoNormal align=center style='text-align:center'><u><span
style='font-size:28.0pt;'> </span></u></p>


the result is an underscore displayed down the center of the page..... the
situation occurs intermittently throughout the document....

thank you in advance for your continued interest in this issue and for any
further advice you may be able to offer....

jack.




"lostinspace" wrote:

----- Original Message -----
From: "jack w. bonney vancouver"
Newsgroups: microsoft.public.word.docmanagement
Sent: Thursday, July 07, 2005 11:01 PM
Subject: ms-word creates html dash problem for front page


greetings from canada!!

i am using ms-word to transform/create a word-mode file into an html-mode
file (a manual of about 350+ pages)..... (using the "save as .html"
function).

background to the application: the html file has been run through the html
filter to eliminate the ms-word superfluous coding that would normally enable
"full circle" editing using ms-word....

we do not need that ms-word coding as the html file will not be going back
to ms-word - and although there may be some minor editing by ms-frontpage, the
file will then be passed onward to a custom program that automatically
generates thousands of hyperlinks as well as a permuted cross-reference from
the content itself....

my problem: regardless of whether the html filter is used or not, the html
output from ms-word contains a small flaw: on some blank lines throughout
the html-mode file, there is a "_" (i.e. looks like an "underscore"
character) in column 1 --

my request for assistance: i would like to know how to completely prevent
this character from ever occurring so that we might alter our word processing
text entry rules to avoid its presence in the html-mode file....

any suggestions will be very much appreciated....

jack bonney




Hello Jack,
Ever hear of Patterson Park ;-)))

Your first three paragraphs are in terrible conflict.
It's impossible to create an html page from Word that does not include both
some invalid and deprecated html. (A simple example is in bolding fonts:
<b></b>)

You haven't specified which version of Word you're using, which may assist
the MVPs.

I'm assuming that you started from scratch and created your document in Word
with both font and page layout formatting? Perhaps even some other Word
goodies added in?

My suggestion is to take your entire document of approx 350 pages and save
it as a STRAIGHT-TEXT file with not a solitary piece of formatting or layout
included and create your html pages from that text. (It's the only way to
avoid such simple errors as you currently face.)
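
As a rough illustration of the straight-text route, here is a minimal Python
sketch that turns a plain-text export into bare paragraphs. The function name
text_to_html is made up for this example, and it assumes paragraphs in the
text file are separated by blank lines, which may not hold for every export
from Word. It is a starting point, not a finished converter.

```python
import html
import re


def text_to_html(text: str) -> str:
    """Wrap each blank-line-separated block of plain text in a bare <p> element.

    Hypothetical helper: assumes paragraphs are separated by blank lines.
    """
    # Normalise Windows and old Mac line endings to "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # One or more blank lines mark a paragraph boundary.
    blocks = re.split(r"\n\s*\n", text.strip())
    paragraphs = []
    for block in blocks:
        # Re-join hard-wrapped lines into a single line of text.
        joined = " ".join(line.strip() for line in block.split("\n") if line.strip())
        if joined:
            # Escape &, < and > so the text cannot be mistaken for markup.
            paragraphs.append("<p>" + html.escape(joined) + "</p>")
    return "\n".join(paragraphs)


if __name__ == "__main__":
    sample = "Chapter 1\r\n\r\nFirst line\r\nsecond line & more\r\n"
    print(text_to_html(sample))
```

Layout (centering, font sizes) would then be added with a stylesheet rather
than carried over from Word.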

The older versions of FrontPage were no better when using fonts or
components than Word. FP also used deprecated html.
MS has said that 2003 FP (Standalone) would be better, however I've heard
from others that's not so.

In your second last paragraph you provide the following:

" there is a '_' (i.e. looks like an 'underscore' character) in column --"

There is conflict here as well?
Is it an underscore or a double-hyphen and is it surrounded by quotes or
have you just used the quotes for emphasis?

I'm inclined to believe that these mystery characters are weaknesses in the
html cleaner. Nor do I believe you'll find a setting in Word which will
control the use of these characters (at least as related to the html
component.)

If the quotes are there?
A search and replace with any text editor would be very easy to do.

If the underscore is the mystery character then it's possibly related to the
image links that Word creates.

Before composing this reply to you?
I used Word to create only my 2nd ever html page. I have numerous
settings in Word (as related to html) turned off as I dug for solutions for
others.

I copied a complex CSS-HTML page into Word and then saved as html (with the
aforementioned settings.) I do not have the html cleaner for Word2000
installed as I never intend to use Word to create web pages.

The result of the copy and save as related to the viewing of the html
afterward was a far cry from being anywhere in the same continent as clean
html.



Jack,
Aye! Patterson Park, a trotting park. I'm not sure of the then or
today location. (1960 reference) Some seem to believe that Patterson is
today's Sandown. At one time I was attempting to trace a trotting circuit
which began in the spring in locations throughout NorthWest Canada and ended
in the Fall in Washington, Oregon and California (NorthWest US.) I didn't
have any success as the documentation for these things is rather skimpy.

Given your method of creation for the doc and with faces to save?
Your options are going to be very limited.
I don't believe either Word or the html cleaner will help. Nor do I believe
you'll find an automated method of removing mystery characters.

I downloaded the Standalone html cleaner that Bob Buckland provided a link
to and ran the aforementioned page that I created. The result was that the
cleaner removed most everything, however some items (such as absolute
paragraph position) still remain. [Even though I had NOT any absolute
paragraph positioning in the original web page.]

I copied and pasted the three html lines that you provided into the html
option of FrontPage and was left with a blank web page. No mystery
characters. (Leads me to believe that something exclusive to your end [server
or OS] is the cause, although you did previously add that the mystery
characters were not appearing consistently.)

Additionally, most everything contained in the three lines of html that you
provided is either invalid or deprecated html.

Those three lines should read:

<p></p>

<p></p>

<p></p>

And no more. Anything in excess is bloat caused by either Word, the html
cleaner or FrontPage.
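
For anyone who wants to try stripping that bloat from an existing file, here
is a small Python sketch that collapses "blank" paragraphs down to bare,
attribute-free empty paragraphs as described above. The function name
collapse_empty_paragraphs is invented for this example. It treats a paragraph
as blank when it holds nothing but whitespace, non-breaking-space entities and
a short list of inline wrapper tags; that list, and the guess that the blank
may be a non-breaking space, are assumptions and not something confirmed in
this thread. It is regex-based, so it will miss unusual markup (for example a
">" inside an attribute value) and should be run on a copy of the file.

```python
import re

# A <p> element containing only whitespace, non-breaking spaces and
# inline wrapper tags (assumed list: u, span, b, i, font, o:p).
EMPTY_PARAGRAPH = re.compile(
    r"<p\b[^>]*>"
    r"(?:\s|&nbsp;|&#160;|</?(?:u|span|b|i|font|o:p)\b[^>]*>)*"
    r"</p\s*>",
    re.IGNORECASE,
)


def collapse_empty_paragraphs(source: str) -> str:
    """Replace every visually blank paragraph with a bare <p></p>."""
    return EMPTY_PARAGRAPH.sub("<p></p>", source)


if __name__ == "__main__":
    sample = (
        "<p class=MsoNormal align=center style='text-align:center'><u><span\n"
        "style='font-size:28.0pt;'>&nbsp;</span></u></p>\n"
        "<p class=MsoNormal>Real text</p>"
    )
    print(collapse_empty_paragraphs(sample))
```

Paragraphs that contain real text, images or line breaks are left alone,
because none of those match the pattern.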

It's still my opinion that the most effective and most efficient method is
for you to start from scratch with basic text and design your layout with
CSS/html.
The alternative for server side is PHP/MySQL.

Web pages created by Word only make your already complicated situation more
complicated.
In the long run, the lack of a solution today will provide you far more
headaches in the future.

As far as Dreamweaver?
Many folks say that it's very useful software, while others say that
it compares in many ways to FP in creating invalid HTML. However, using DW as
an html cleaner (an option that was expressed) may be worth exploring,
although not worth the price of purchase.

Unless Bob Buckland or one of the others is able to provide additional
insight?
I don't see that many options for you that will provide what you desire to
accomplish.