#1
I'm trying to save Word files containing Unicode characters to plain text so I can clean them up and code them as InDesign Tagged Text. Save As Encoded Text actually works nicely, except that certain characters sometimes save successfully, and sometimes get converted to a single open parenthesis.

I've posted a tiny test file showing this at http://www.pegtype.com/test.doc. The mystery is that the same character in two places in the same file converts differently.

Thanks for any help.
Ken Benson
#2
Hi Ken,

Interestingly, the Reveal Formatting task pane in Word did not show any formatting differences between your paragraph #1 and #2-5, but File > Web Page Preview followed by View > Source did show differences, and those are apparently enough to confuse the plain text converter.

If you turn on the [x] Show Paste Options button in Tools > Options > Edit, then select all of paragraph 1, cut it (Ctrl+X), repaste it (Ctrl+V), and choose 'Keep Text Only' from the icon, it corrects the problem. Copying the similar two characters from items 2-5 and pasting them over the problem ones in item 1 also corrected the problem.

--
Let us know if this helped you,
Bob Buckland ?:-)
MS Office System Products MVP

*Courtesy is not expensive and can pay big dividends*

Office 2003 Editions explained
http://www.microsoft.com/uk/office/editions.mspx
#3
Hi Bob

Thanks for looking into this. Show Paste Options is probably an option for a newer version of Word (I've got Word 2000), but I seem to be able to accomplish the same thing by using Paste Special | Unformatted Unicode Text. This is a workable solution.

Do you have any idea how the original author (I'm several steps removed from him) could have accomplished this?

Thanks again,
Ken Benson
#4
Hi Bob

I came up with an even better solution than "cut/paste as text". I installed OpenOffice, opened my problem file there, resaved it, and then opened it again in Word. All the problem characters were converted nicely. This method both fixes the problem and keeps the formatting.

Thanks for your help,
Ken Benson
#5
"Ken Benson" wrote:
Do you have any idea how the original author (I'm several steps removed from him) could have accomplished this?

Hi Ken,

He probably used "Insert Symbol", and the font is a symbol ("decorative") font like Symbol or Wingdings.

Since you don't want the symbols to change if you change the font (say by applying another one, or applying another style), Word inserts those symbols as a kind of symbol field. But it will never show you the field code, only the result.

The effect is as desired: the field can have any font applied, but the character will still be inserted from the font specified in the field.

The drawback is that Word usually can't tell you the code or font. AscW(Selection.Text) will return the code 40 ... "(" = opening parenthesis, and the font dropdown will show you the font of the surrounding text.

If you select the symbol and open "Insert Symbol" again, Word can usually tell you the font and the code.

This doesn't always work, though. Word expects the codes to be in the range from U+F000 to U+F0FF. But since symbol fields don't change if you add or subtract multiples of 256 from that code, you often get files with messed up symbols.

The WordPerfect import filters seem to be a special culprit in this regard, but I suspect some other filters aren't working correctly either.

If you only need unformatted text, "Paste Special as text" is a good solution.

If you need to keep the formatting, I have posted a macro to turn "proper symbol fields" into regular characters... google for "SymbolsUnprotect".

For messed up symbol "fields", I haven't found a good solution, since you'd have to do a lot of processing for each character, which takes too long even on moderately sized files.

Greetings,
Klaus
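To make the "multiples of 256" point concrete, here is a minimal Python sketch of the arithmetic only. It is not Klaus's SymbolsUnprotect macro and does not touch Word at all. It assumes, based on the description above, that only the low byte of a symbol code selects the glyph, so any code can be folded back into the U+F000 to U+F0FF range. The function names and the sample codes are hypothetical.

```python
# Range in which, per the post above, Word expects symbol codes.
SYMBOL_BASE = 0xF000
SYMBOL_LAST = 0xF0FF


def is_in_expected_range(code: int) -> bool:
    """Return True if the code already lies in U+F000..U+F0FF."""
    return SYMBOL_BASE <= code <= SYMBOL_LAST


def normalize_symbol_code(code: int) -> int:
    """Fold a symbol code into U+F000..U+F0FF by keeping only its low byte.

    Assumption: codes that differ by a multiple of 256 refer to the
    same glyph in the symbol font, as described in the post.
    """
    return SYMBOL_BASE + (code % 256)


if __name__ == "__main__":
    # Hypothetical codes that share the low byte 0xB7.
    for code in (0xF0B7, 0x00B7, 0xF1B7):
        print(hex(code), is_in_expected_range(code),
              hex(normalize_symbol_code(code)))
```

A check like `is_in_expected_range` could flag the "messed up" cases, while `normalize_symbol_code` shows what the repaired value would be under the stated assumption. Applying it to a real document would still need per-character processing inside Word, which is the slow part Klaus mentions.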
#6
Hi Ken,

Nice!!!

I just tried it on a file with "messed up" symbols. That file had given me trouble last week, because the symbols got messed up in InDesign, and I couldn't fix it in Word. I had resorted to editing the RTF code "by hand" with Find/Replace.

OO Writer 1.9.97 fixed the symbols fine. I'd keep it installed if only for that reason.

Thanks for the tip!
Klaus
#7
"Klaus Linke" wrote in message ...
OO Writer 1.9.97 fixed the symbols fine. I'd keep it installed if only for that reason.

Yes, I think I'll just be running all my bizarre author files through OpenOffice from now on.

Ken Benson