#1
Posted to microsoft.public.word.docmanagement
Bizarre wildcard replace
Hi, I'm trying to write a basic Word-to-HTML conversion (since I can't find any tool that actually does a clean job of it.) For example, I want to find all instances of italics and replace them with <i>(original text)</i>.

I've tried doing this in search-and-replace: turned on "Use Wildcards", selected Format in the 'find' with Font set to "Italic", and searched for '*'. In the replace, I set the formatting to "Not Italic" and made the replace string <i>^&</i>. I've also tried it with (*) in the find field and <i>\1</i> in the replace field.

My problem is that this seems to only be matching one character at a time -- I end up with <i> and </i> around every individual character, instead of around an entire word, part of a word, sentence or paragraph.

Any help would be much appreciated.
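The result being asked for can be sketched in Python. This is only an illustration of the goal, not of anything Word does internally: it assumes the intended markup is <i>...</i> tags, and it models the document as a list of (text, is_italic) pairs, which is a hypothetical stand-in for Word's formatted runs. The function name runs_to_html is likewise made up for the example.

```python
from html import escape
from itertools import groupby


def runs_to_html(runs):
    """Wrap each contiguous stretch of italic text in a single <i>...</i> pair.

    `runs` is an iterable of (text, is_italic) pairs, a hypothetical
    stand-in for the formatted runs in a Word document.
    """
    parts = []
    # groupby merges neighbouring runs that share the same italic flag,
    # so the tags go around the whole span rather than around each piece.
    for is_italic, group in groupby(runs, key=lambda run: run[1]):
        text = escape("".join(piece for piece, _ in group))
        parts.append(f"<i>{text}</i>" if is_italic else text)
    return "".join(parts)


runs = [("A ", False), ("very", True), (" important", True), (" point.", False)]
print(runs_to_html(runs))  # A <i>very important</i> point.
```

The two adjacent italic runs come out inside one pair of tags, which is the behaviour wanted from the search-and-replace.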
#2
Posted to microsoft.public.word.docmanagement
Word's implementation of regular expressions (which is what you get with 'Use Wildcards' checked) uses minimal matching, as opposed to Unix regular expressions, which use maximal matching by default. In other words, it looks for the *smallest* sequence of characters that matches the Find expression -- in your case, one character at a time. Which is a damned nuisance.

However, there's an easy fix, at least for the example you give: if you leave 'Use Wildcards' unchecked, you can leave the Find box blank; it will then match the full sequence with the formatting you specify. In the Replace box, use ^& to stand for the text that was found. So:

Find: (blank), Format = italic
Replace: <i>^&</i>, Format = not italic

"cfulmer" wrote in message ...
> My problem is that this seems to only be matching one character at a time [...]
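The difference between minimal and maximal matching can be seen in a small Python sketch. Python's re module is used here purely as a stand-in, since Word's wildcard engine is a different implementation; the helper name wrap and the sample text are made up for the example.

```python
import re


def wrap(pattern, text):
    """Surround every match of `pattern` in `text` with <i>...</i>."""
    return re.sub(pattern, lambda m: f"<i>{m.group(0)}</i>", text)


italic_span = "emphasis"

# Minimal (lazy) matching: each match is the shortest possible, one character.
minimal = wrap(r".+?", italic_span)

# Maximal (greedy) matching: one match covering the whole span.
maximal = wrap(r".+", italic_span)

print(minimal)  # <i>e</i><i>m</i><i>p</i><i>h</i><i>a</i><i>s</i><i>i</i><i>s</i>
print(maximal)  # <i>emphasis</i>
```

The first result is the tag-around-every-character output described in the question; the second is what leaving the Find box blank gives you in Word.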
#3
Posted to microsoft.public.word.docmanagement
Had you thought of trying a tool named "Microsoft Word 2003"? It does a *perfect* job of converting a Word document to HTML.

That's a lie :-) It does a perfect job of converting a Word document into XHTML. HTML is not capable of describing a Word document, so Word writes XML inline with the HTML to describe the components of the document that HTML cannot.

If you are working with Word 2003 Enterprise Edition, you will have a tool named InfoPath available. InfoPath enables you to write an XML Transform that would remove the various components you do not want from the XML that Word writes.

Otherwise, you will find any number of tools available that will do a greater or lesser version of what you are trying to do. FrontPage does an excellent job if you simply paste the text of the Word document into the FrontPage editor. DreamWeaver does a great job of filtering Word's HTML on import.

I think making your own is very much a case of re-inventing a wheel that is already trundling down the highway under dozens of cars :-)

Cheers

On 27/1/06 9:01 AM, "cfulmer" wrote:
> [...] I can't find any tool that actually does a clean job of it [...]

--
Please reply to the newsgroup to maintain the thread. Please do not email me unless I ask you to.

John McGhie
Microsoft MVP, Word and Word for Macintosh
Consultant Technical Writer
Sydney, Australia
+61 (0) 4 1209 1410