Posted to microsoft.public.word.newusers
Klaus Linke
Finding duplicate phrases and paragraphs.

"Frank Martin" wrote:
I am copying a rare particular story from many different newsgroups and
pasting the fragments into a Word2003 document.

Is there some way to automatically find duplicated sections of the story
so as to help weld it into one seamless whole?

In the spell checker one can easily do this for duplicated words, but I
need the same thing for duplicated strings, and even sentences.



Hi Frank,

For repeated paragraphs, you could try a wildcard search (Edit > Find with
"Use wildcards" checked) for

(^13[!^13]@^13)*\1

If a repeated paragraph is found, the selection starts with the first copy
and ends with the second. The pattern needs a paragraph mark in front of
each copy, so there has to be at least one other paragraph in between.
For repeated paragraphs right next to each other, you could use

^13([!^13]@^13)\1
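If you'd rather check outside Word, the same idea can be sketched in Python. This is only a rough sketch under my own assumptions: the document has been saved as plain text with one paragraph per line, and find_duplicate_paragraphs is a name made up for illustration, not anything built into Word.

```python
from collections import defaultdict


def find_duplicate_paragraphs(text):
    """Map each repeated paragraph to the 1-based positions where it occurs.

    Assumes one paragraph per line (plain-text export); this is an
    illustrative helper, not part of Word.
    """
    positions = defaultdict(list)
    for number, paragraph in enumerate(text.split("\n"), start=1):
        key = paragraph.strip()
        if key:  # skip empty paragraphs
            positions[key].append(number)
    # Keep only paragraphs that were seen more than once.
    return {p: nums for p, nums in positions.items() if len(nums) > 1}


if __name__ == "__main__":
    sample = "Once upon a time.\nA new part.\nOnce upon a time.\nThe end."
    for paragraph, numbers in find_duplicate_paragraphs(sample).items():
        print(numbers, paragraph)
```

Unlike the wildcard search, this reports adjacent and non-adjacent repeats in one pass, but it only catches paragraphs that match exactly once leading and trailing spaces are ignored.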


For repeated sentences or other duplicated strings of some length, a
wildcard search won't do; you'd need a macro.
One approach is to read the whole document into a string and then look for
repeated phrases in that string. Algorithms for this should be easy to find
with Google, for example:
http://en.wikipedia.org/wiki/Longest...string_problem
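
To give an idea of what such an algorithm looks like once the text is in a string, here is a minimal Python sketch of one simple approach: sort all the suffixes of the string and compare neighbours. It is my own illustration (the function name is made up), not necessarily the method described at the link, and it is written for clarity rather than speed, so it will be slow and memory-hungry on a long document.

```python
def longest_repeated_substring(text):
    """Return the longest substring occurring at least twice in text.

    Naive suffix-sorting sketch: the two occurrences may overlap, and
    building every suffix takes memory quadratic in the text length.
    """
    suffixes = sorted(text[i:] for i in range(len(text)))
    best = ""
    # After sorting, suffixes sharing a long prefix end up next to each other.
    for first, second in zip(suffixes, suffixes[1:]):
        limit = min(len(first), len(second))
        n = 0
        while n < limit and first[n] == second[n]:
            n += 1
        if n > len(best):
            best = first[:n]
    return best


if __name__ == "__main__":
    print(repr(longest_repeated_substring("the cat sat. the cat ran.")))
```

To find all the repeats rather than just the longest one, you could cut out the phrase it returns and run it again on what is left.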

Regards,
Klaus