  #1   Report Post  
BruceM
 
Find and Replace anomaly

I have used Find and Replace many times to replace extra paragraph marks and
paragraph marks that occur at the end of every line (typical of things copied
from web pages). Now I am in a situation where a government representative
(we are in a regulated industry) wants to see where we keep a particular
regulation for reference. The twist is that the regulation is not available
from the government in printable form, yet the web site's copy is not
considered adequate for our records. That leaves me to copy from the web
site and attempt to make it into a document. I have done this before, but
this one is different. I have replaced all styles with Normal (for now; I
will apply custom styles later), used a macro to remove all hyperlinks, used
Find and Replace to remove all graphics. Here's the problem: I cannot use
Find and Replace to replace a succession of paragraph marks with a single
paragraph mark. I can do it in any other document, but not in this one. If
I copy a succession of two paragraphs from the document to a new document I
get the same result (it doesn't identify the successive paragraphs as being
two paragraphs), but if I add paragraphs to the new document with the Enter
key Find and Replace works as it should. Similarly, when I add empty
paragraphs to the troublesome document I can find them as I would expect. I
have tried a wildcard search (Find ^13{2,}, Replace With ^p), and without
wildcards (Find ^p^p, Replace With ^p). No luck. If I search for a single
paragraph I can find every one, including both in the pair. If I replace
every paragraph mark with, say, a £, then attempt to replace every instance
of ££ with £, same problem as with the paragraphs: it does not recognize it
as a pair.
There is nothing such as a space between the paragraphs. I have removed all
manual formatting, hyperlinks, graphics, etc. In short, everything in the
document is part of the ASCII extended character set. I replaced ^13 with
^p, and ^p with ^p (with and without wildcards respectively). I copied the
entire document to Notepad, then opened that with Word. In every case, same
result.
Anybody have an idea as to what is going on here?
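
For readers who want to reproduce the symptom outside Word, here is a minimal Python sketch of the collapse being attempted with Find ^13{2,}. It assumes the text has been exported with paragraph marks as carriage returns; the zero-width space in the second sample is purely a hypothetical stand-in for "something invisible between the two marks", not a diagnosis of this document.

```python
import re

PARA = "\r"  # assumption: each paragraph mark is exported as a carriage return (^13)


def collapse_paragraph_runs(text: str) -> str:
    """Replace two or more consecutive paragraph marks with a single one."""
    return re.sub(PARA + "{2,}", PARA, text)


# Ordinary empty paragraphs: the runs collapse as expected.
clean = "one\r\rtwo\r\r\rthree\r"

# Hypothetical case: an invisible character sits between the two marks,
# so they only look adjacent and the pattern never matches them as a pair.
hidden = "one\r\u200b\rtwo\r"

print(repr(collapse_paragraph_runs(clean)))
print(repr(collapse_paragraph_runs(hidden)))
```

If the second string comes back unchanged, the marks are not truly adjacent, which matches the behaviour described above where ££ also failed to match.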
  #2   Report Post  
Jay Freedman
 

Hi Bruce,

The characters could be manual line breaks instead of paragraph marks. The
Find code is ^l (a lower case ell). Turn on nonprinting character display
and check the line ends. A manual break looks like a left-pointing arrow
with a hooked tail.
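
The same check can be sketched in Python for text that has been pulled out of the document. This assumes the export keeps Word's character codes, where a paragraph mark is character 13 and a manual line break is character 11; the sample string is made up for illustration.

```python
from collections import Counter

# Assumption: the exported text preserves Word's break characters.
BREAK_NAMES = {
    "\r": "paragraph mark (^p / ^13)",
    "\x0b": "manual line break (^l / ^11)",
    "\n": "line feed",
}


def count_breaks(text: str) -> Counter:
    """Count each kind of break character that appears in the text."""
    return Counter(BREAK_NAMES[ch] for ch in text if ch in BREAK_NAMES)


# Hypothetical sample: one manual line break and two paragraph marks.
sample = "Section 1\x0bcontinued\rSection 2\r"

for name, n in count_breaks(sample).items():
    print(f"{n:3d}  {name}")
```

If the counts show manual line breaks where paragraph marks were expected, searching for ^l instead of ^p is the thing to try.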

See http://word.mvps.org/FAQs/Formatting/CleanWebText.htm for more help.

--
Regards,
Jay Freedman
Microsoft Word MVP FAQ: http://word.mvps.org

  #3   Report Post  
lostinspace
 

----- Original Message -----
From: "BruceM"
Newsgroups: microsoft.public.word.docmanagement
Sent: Monday, January 17, 2005 11:43 AM
Subject: Find and Replace anomaly


Bruce,
Probably the most helpful thing you could do is to provide the URL of the
page you're attempting to convert.

CSS and HTML have formatting options that control spacing and layout in ways
that go beyond Word's formatting options.

Another option may be to print the web page to a PDF file, retaining
all formatting in the process.


  #4   Report Post  
BruceM
 

Thanks for taking the time to reply. I guess I should have mentioned I
displayed nonprinting characters. I finally figured out what it was (sort
of). At any rate I made the problem go away. Scattered throughout the
document was a sort of right angle arrow pointing up (a graphic, not a line
break) followed by the word "top" as a hyperlink followed by a paragraph
mark. I used ^g to get rid of the graphics, then replaced "top" in hyperlink
character style with "top" in Normal style, then I replaced top^p with
nothing. It was the paragraph mark after the hyperlink that went weird on me.

  #5   Report Post  
Suzanne S. Barnhill
 

Could it have been a text-wrapping break? They look a lot like line breaks,
and AFAIK there is no way to search for them using Find.

--
Suzanne S. Barnhill
Microsoft MVP (Word)
Words into Type
Fairhope, Alabama USA
Word MVP FAQ site: http://word.mvps.org
Email cannot be acknowledged; please post all follow-ups to the newsgroup so
all may benefit.

  #6   Report Post  
BruceM
 

No, not text-wrapping. What was so strange was that no matter what character
I substituted for the paragraph marks, Find could not locate the one that
was originally on the line with the hyperlink.

  #7   Report Post  
Martin P
 

One way to get hold of these unidentifiable characters is to build an
exhaustive list of what they are not and negate it with the exclamation mark,
as in [!A-Za-z0-9]. It is not difficult to create an exhaustive list. In
a duplicate document, get rid of characters until almost nothing remains.
Start off with, say, [A-Za-z0-9], replace that with nothing, and add the
characters that remain to the list. Once you have the list, add the
exclamation mark and use that in the original document (with wildcards
enabled).
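
As an illustration of the same idea outside Word, here is a small Python sketch that negates a list of expected characters and reports whatever is left, with its code point. The EXPECTED list and the sample text are assumptions made up for the example; in practice the list would be grown the way the post describes, until only the mystery characters remain.

```python
import re
import unicodedata

# Assumption: the characters we expect to see, written as the inside of a
# regex character class. Extend it until only the unknowns are reported.
EXPECTED = r"A-Za-z0-9 .,;:!?'\"()\-\r"


def unexpected_characters(text, expected=EXPECTED):
    """Map each character outside the expected set to the positions where it occurs."""
    pattern = re.compile(f"[^{expected}]")
    found = {}
    for match in pattern.finditer(text):
        found.setdefault(match.group(), []).append(match.start())
    return found


# Hypothetical sample containing a non-breaking space and a manual line break.
sample = "See section 4(a).\r\u00a0top\r\x0bNext part.\r"

for ch, positions in unexpected_characters(sample).items():
    name = unicodedata.name(ch, "<unnamed control character>")
    print(f"U+{ord(ch):04X} {name} at {positions}")
```

Reporting the code point matters more than the glyph here, since the characters in question are usually invisible on screen.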

  #8   Report Post  
Martin P
 

Without knowing what is in the document, here is my shot in the dark.
Paragraph marks usually come after periods, exclamation marks and question
marks. Change ? to £ and ! to ¥. Then, with wildcards enabled, replace
([.£¥])^13 with \1§. Now remove the remaining paragraph marks by replacing ^13
with nothing. Finally, replace £ with ?, ¥ with ! and § with a paragraph mark.
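
The placeholder technique above can be sketched in Python as follows. This is a simplified illustration, not a Word macro: it assumes paragraph marks arrive as carriage returns, and it skips the ? and ! substitution because a Python character class can hold those characters directly. The function name and sample text are made up for the example.

```python
import re

PARA = "\r"             # assumption: paragraph marks are carriage returns (^13)
PLACEHOLDER = "\u00a7"  # the section sign used as a temporary stand-in


def keep_sentence_final_paragraphs(text: str) -> str:
    """Keep only the paragraph marks that directly follow . ? or !"""
    if PLACEHOLDER in text:
        raise ValueError("placeholder already occurs in the text; choose another")
    # Step 1: protect marks that follow sentence-ending punctuation.
    protected = re.sub(r"([.?!])" + PARA, r"\1" + PLACEHOLDER, text)
    # Step 2: delete every paragraph mark that was not protected.
    stripped = protected.replace(PARA, "")
    # Step 3: turn the placeholders back into paragraph marks.
    return stripped.replace(PLACEHOLDER, PARA)


# Hypothetical sample: a mid-sentence break, then an empty paragraph.
sample = "The rule applies \rto all sites.\r\rSee part 2.\r"
print(repr(keep_sentence_final_paragraphs(sample)))
```

Because step 2 deletes the unprotected marks outright, a line that ends without a trailing space will be joined directly to the next word; replacing with a space instead may suit some documents better. Headings and list items that do not end in punctuation will also be merged into the following text, so this is best tried on a copy.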
