Posted to microsoft.public.word.docmanagement
Don
Copying content of HTML Document

mm18 wrote:

Some time ago, I posted a question on the Microsoft forum that involved
wanting to keep the original creation date of a Word document when I
retrieved it. The Word program was changing the date to the current
date. A Microsoft adviser responded with the suggested changes I
needed to make.

However, I now have another problem because of this. If I copy some
text from a Web HTML document and paste it into Word, the document is
full of hyperlinks and the text is unreadable unless I manually go
through the text and convert all the hyperlinks to text. This is not a
usable solution.

I would greatly appreciate any suggestions.


To begin: are you copying and pasting an entire web page into Word?
If so, you're likely copying many of the page's navigational links, or
even third-party links (very common in web pages today), that are
irrelevant to the material you intended to copy.
A simple way to cut down the irrelevant material is to use the
website's "print" option, which drastically reduces the content.

As an alternative to copying the page content into a Word document, if
the entire page is relevant I frequently save a local copy using my
browser's Save As option.
If just a paragraph or two is relevant to my interests, I copy and
paste that portion, along with the source URL, into either Notepad or an
RTF file.

It's not necessary to put all these items into Word format today; doing
so only creates extra work.
Instead, there are many desktop tools that let you view locally saved
materials in whatever file formats the software installed on your
computer supports. (I use Copernic Desktop Tool.)

Another option to consider is viewing the HTML source of the print page
and trimming the HTML there (it's not a good idea to copy raw HTML into
Word). Many tools exist for removing either all of the HTML or selected
portions.
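
For anyone comfortable running a small script, here is a rough sketch of
the "remove ALL html" approach. It is only an illustration, not any
particular tool mentioned above: it assumes Python 3 and uses the
standard library's html.parser module. The names TextOnly and
html_to_text and the sample string are made up for this example. It
keeps the visible text (including the text of each hyperlink) and throws
away the tags, so what you paste into Word or Notepad carries no links.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect visible text and drop all tags, including <a> hyperlinks."""

    # Hypothetical choices for this sketch: tags whose contents are hidden,
    # and tags that should start a new line in the plain-text output.
    SKIP = {"script", "style"}
    BLOCK = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_data(self, data):
        # Link text arrives here like any other text; the href is discarded.
        if not self.skip_depth:
            self.parts.append(data)


def html_to_text(html):
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace and drop empty lines.
    lines = (" ".join(line.split()) for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)


sample = (
    '<p>See the <a href="http://example.com/x">full report</a> for details.</p>'
    "<script>var a=1;</script><p>Second paragraph.</p>"
)
print(html_to_text(sample))
```

To use it on a page saved with the browser's Save As option, read the
saved file into a string and pass that string to html_to_text. Real
pages are messier than the sample, so expect to tidy up the result by
hand.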