#1
Posted to microsoft.public.word.docmanagement
First line in rtf file and font
I have a seemingly random and vexing problem. Sometimes, when I send an rtf or doc file to a client (translations), the Swedish letters åäöÅÄÖ are replaced by Asian-looking characters. What usually happens is that "deff#" in the first line of the file is *not* "deff0", but, e.g., "deff17" (deff# is the font, I understand). What I do not understand is why this happens.

In the hope that someone with experience in these matters can provide a clue, or even a solution, I will present below the first row of rtf files I have looked at with UltraEdit (the text editor). There are four files of each, and the file names explain who sent which file to whom.

Additional Text_1_Sent by client.rtf
{\rtf1\adeflang1025\ansi\ansicpg1252\uc1\adeff0\deff0\stshfdbch13\stshfloch0\stshfhich0\stshfbi0\deflang1033\deflangfe2052{\fonttbl{\f0\froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman;}{\f1\fswiss\fcharset0\fprq2{\*\panose 020b0604020202020204}Arial;}

Additional Text_2_Sent by me.rtf
{\rtf1\ansi\ansicpg1252\uc1 \deff17\deflang1033\deflangfe1033{\fonttbl{\f0\froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman;}{\f1\fswiss\fcharset0\fprq2{\*\panose 020b0604020202020204}Arial;}

Additional Text_3_Sent by client.rtf
{\rtf1\adeflang1025\ansi\ansicpg1252\uc1\adeff0\deff0\stshfdbch0\stshfloch0\stshfhich0\stshfbi0\deflang1033\deflangfe1033{\fonttbl{\f0\froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman;}{\f1\fswiss\fcharset0\fprq2{\*\panose 020b0604020202020204}Arial;}

Additional Text_4_Sent by me.rtf
{\rtf1\adeflang1025\ansi\ansicpg1252\uc1\adeff0\deff0\stshfdbch0\stshfloch0\stshfhich0\stshfbi0\deflang1033\deflangfe1033{\fonttbl{\f0\froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman;}{\f1\fswiss\fcharset0\fprq2{\*\panose 020b0604020202020204}Arial;}

As you can see, the first file from the client to me had "deff0". The file I sent back had "deff17", but the Swedish letters came out alright!!! They were also okay in file 3 and 4 ("deff0" in both cases).

= = = = = = = = =

Extracted Text from Bioplate PDF_1_Sent by client.rtf
{\rtf1\adeflang1025\ansi\ansicpg1252\uc1\adeff0\deff0\stshfdbch13\stshfloch0\stshfhich0\stshfbi0\deflang1033\deflangfe2052{\fonttbl{\f0\froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman;}{\f1\fswiss\fcharset0\fprq2{\*\panose 020b0604020202020204}Arial;}

Extracted Text from Bioplate PDF_2_Sent by me.rtf
{\rtf1\ansi\ansicpg1252\uc1 \deff0\deflang1033\deflangfe1033{\fonttbl{\f0\froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman;}{\f1\fswiss\fcharset0\fprq2{\*\panose 020b0604020202020204}Arial;}

Extracted Text from Bioplate PDF_3_Sent by client.rtf
{\rtf1\adeflang1025\ansi\ansicpg1252\uc1\adeff0\deff0\stshfdbch0\stshfloch0\stshfhich0\stshfbi0\deflang1033\deflangfe1033{\fonttbl{\f0\froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman;}{\f1\fswiss\fcharset0\fprq2{\*\panose 020b0604020202020204}Arial;}

Extracted Text from Bioplate PDF_4_Sent by me.rtf (Swedish letters åäöÅÄÖ = Asian characters)
{\rtf1\ansi\ansicpg1252\uc1 \deff17\deflang1033\deflangfe1033{\fonttbl{\f0\froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman;}{\f1\fswiss\fcharset0\fprq2{\*\panose 020b0604020202020204}Arial;}

For this file, it was "deff0" for files 1-3, but file 4 had "deff17", and the Swedish letters were screwed up.

- - - - -

I don't know if it is possible for anyone to figure out what makes "deff#" switch from 0 to 17 (and sometimes to other numbers), and in one case having the Swedish letters come out okay and in the other not. But perhaps there are clues that will help someone figure out, at least in principle, what has happened, which might help me figure out how to avoid the problem.

Thank you for your consideration.
Hans L
#2
On Jan 24, 9:29 am, Hans L wrote: "I have a seemingly random and vexing problem. [...]"

Seems to me like a field code. Select the lines and press Alt+F9, or alternatively right-click in the sentence and select Toggle Field Codes.
#3
I meant to mention that I am using Word 2000 & Win XP Home. I do not know what my client uses.

Also, I have searched the Net up and down, but cannot find any list of what fonts deff0, deff1, deff2, etc. stand for.

Hans L
#4
Hi Hans,

\deffN specifies the default font and matches a font number in the font table (\fonttbl) in the document.

Each version of Word gets a bit of an upgrade to the RTF spec. In theory, RTF 'readers' (like Word) are to ignore RTF elements that they don't know, but such is not always the case. For Word 2000 the RTF spec is v1.6 (Unicode support arrived one version earlier, in v1.5 for Word 97), and Word 2007 is v1.9. The \deff keyword has been in since version 1.0 (which the spec defined as being for use with "Microsoft MS-DOS(R), Windows(tm), OS/2(R), and Apple(R) Macintosh(R) applications"). Unfortunately, the version number shown in the file is only the major revision, so all RTF documents start off with \rtf1 (all version 1).

You'll find copies of each of the RTF specifications and tips on working with RTF on http://technet.microsoft.com and http://sourceforge.net

From just the snippets you provided I'm not sure all of the data is there, but in your last example the default language changed to U.S. English.

Peter Jamieson should be by shortly. He's pretty much a wizard at reading RTF and may spot something else.

--
Bob Buckland ?:-)
MS Office System Products MVP
*Courtesy is not expensive and can pay big dividends*
#5
Bob, I was just going to try to find a way to get in touch with MS when I saw your message (I do not get e-mail notifications, no matter what I do :-( ). Great to hear from you.

All rtf text above is the entire first line when I look at the rtf files with UltraEdit (text editor). I will check out http://technet.microsoft.com and http://sourceforge.net, although I have already checked out http://msdn2.microsoft.com/en-us/lib...ffice.10).aspx without getting too much info (read: without understanding too much :-) ).

I hope that you are right in that Peter Jamieson will come by. I am in deeply over my head here, but I need to understand why these things happen, because it affects my livelihood.

Thank you again,
Hans L
#6
Hi Hans,

To make things in your RTF file a bit easier to read in UltraEdit, you may want to download the RTF WordList from http://ultraedit.com and then paste the content into Wordlist.txt (Advanced > Configuration > Syntax Highlighting > [Open], using a copy of the file).

Then open a backup copy of your RTF file and in UE use Search > Replace:

Find What: }{
Replace with: }^p{

[those are curly braces in the example]. Use a backup copy, as the added paragraph breaks will show up as new paragraphs in Word if you reopen the RTF file.

You may also want to find out what version of Word the person you're exchanging files with is using, and what languages are enabled in his version.

--
Bob Buckland ?:-)
MS Office System Products MVP
*Courtesy is not expensive and can pay big dividends*
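For readers without UltraEdit, the same brace-splitting replace can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names and the file paths passed to them are hypothetical, and the result is written to a separate file so the original RTF is left alone.

```python
from pathlib import Path

def split_groups(rtf_text: str) -> str:
    """Put a line break at every '}{' boundary, like replacing }{ with }^p{."""
    return rtf_text.replace("}{", "}\n{")

def write_readable_copy(src: str, dst: str) -> None:
    # latin-1 maps every byte to one character, so reading never fails
    # on stray 8-bit bytes in the file.
    text = Path(src).read_text(encoding="latin-1")
    Path(dst).write_text(split_groups(text), encoding="latin-1")

# Small made-up font table, just to show the effect.
sample = r"{\fonttbl{\f0\froman Times New Roman;}{\f1\fswiss Arial;}}"
print(split_groups(sample))
```

Calling write_readable_copy("original.rtf", "readable_copy.rtf") (both names are placeholders) would give a copy in which each font table entry sits on its own line, which makes the \f numbers much easier to scan.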
#7
Thanks for the advice, Bob.

Now, I have looked like crazy for RTF Wordlist on the IDM site, but I cannot find it. Is it possibly called something else or ...?

I did ask the client what version of Word they used, but have gotten no response yet. Enabled languages - hm, what is that going to tell me?

Regards,
Hans L
#8
Hi Hans,

The RTF WordFile and taglists are on http://ultraedit.com/index.php?name=...howpage&pid=40

In some of your RTF file snippets it appears there were multiple languages in the content. The problem you're having with international characters could come from applying the wrong language setting to specific text in Word.

--
Bob Buckland ?:-)
MS Office System Products MVP
*Courtesy is not expensive and can pay big dividends*
#9
Okay, Bob, I have printed what I think I need, and I will study!

Thanks,
Hans L
#10
Okay, I did all the above, and I now get all f1-f... on different lines. Helps a lot! Here is what I have:

In Additional Text_2_Sent by me.rtf there is no "adeflang", while in all other Additional Text_# files there is an "adeflang1025". I do not yet know what "adeflang" is.

What is interesting is that in 2, "deff" is "f17", which normally would have come out as SimSun, but in this case, it did not!!! The Swedish letters are okay. This is contrary to Extracted Text from Bioplate PDF_4_Sent by me.rtf (Swedish letters åäöÅÄÖ = Asian characters), where "deff" is "f17" and the Swedish letters are indeed SimSun characters.

The only clue I can see that might explain why I did not get SimSun in 2, but in 4, is that in

1: deflangfe2052
2: deflangfe1033
3: deflangfe1033
4: deflangfe1033

In other words, when the client sent me a file with deflangfe2052, my return file was okay, but when I got a deflangfe1033 back from the client, my return file became SimSun. I have no idea if this makes sense, and I do not know what deflangfe is (will check).

Regards,
Hans L
#11
Before I go to bed, here is a link that might be useful (not only to me):
http://msdn2.microsoft.com/en-us/lib...ffice.10).aspx

Regards,
Hans L
#12
Hi Hans,

Yes, that was what I was suspecting: switching the language ID used for displaying FarEast characters to U.S. English (\deflangfe1033) might be what was contributing to you getting the incorrect results. (2052 is People's Republic of China.) Both choices are a bit at odds with working with Swedish text characters (LCID 1053).

Peter Jamieson has said he would be able to take a look at this thread. He's more 'fluent' in RTF.

\adeflang is the bidirectional (alternate direction) language choice for a document, which, if I recall, would be more likely to appear in later versions of Word RTF than 2000. In your case \adeflang1025 would be for Arabic - Saudi Arabia. The appearance of some of the language coding in the document may not mean that it was intentionally used in the document, but that Complex Script (right-to-left) languages have been enabled in the copy of Word that created the document.

--
Bob Buckland ?:-)
MS Office System Products MVP
*Courtesy is not expensive and can pay big dividends*
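To compare the language keywords across files without reading each header by eye, something like the following Python sketch could be used. It is an illustration only: the function name is made up, the lookup table holds just the four language IDs named in this thread (it is not a complete LCID list), and the two sample headers are shortened versions of the first lines quoted earlier.

```python
import re

# Only the language IDs mentioned in this thread; not a complete LCID list.
LCID_NAMES = {
    1025: "Arabic (Saudi Arabia)",
    1033: "English (U.S.)",
    1053: "Swedish",
    2052: "Chinese (PRC)",
}

def header_languages(rtf: str) -> dict:
    """Return {keyword: (lcid, name)} for the default-language keywords,
    or None for a keyword that does not appear in the header."""
    found = {}
    for keyword in ("adeflang", "deflang", "deflangfe"):
        # The leading backslash keeps \deflang from matching inside \adeflang,
        # and the digits requirement keeps it from matching \deflangfe.
        match = re.search(r"\\" + keyword + r"(\d+)", rtf)
        if match is None:
            found[keyword] = None
            continue
        lcid = int(match.group(1))
        found[keyword] = (lcid, LCID_NAMES.get(lcid, "unknown"))
    return found

# Shortened headers (the \stshf... keywords and font table are left out).
client = r"{\rtf1\adeflang1025\ansi\ansicpg1252\uc1\adeff0\deff0\deflang1033\deflangfe2052{\fonttbl"
mine = r"{\rtf1\ansi\ansicpg1252\uc1 \deff17\deflang1033\deflangfe1033{\fonttbl"

for label, header in (("client", client), ("mine", mine)):
    print(label, header_languages(header))
```

Run over all eight files, a listing like this would make it easy to see which combinations of \deflangfe and \adeflang go together with the broken Swedish letters.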
#13
Well, Bob, I hope we are on to something. I'll hold off a little to give Peter time to look at this post (I really hope he will have the time) before I start trying to do anything to alleviate this problem.

Thanks for your help so far!
Hans L
#14
Hello Hans and Bob,

Sorry it took me so long - not Bob's fault! Nor am I quite the whizz with RTF that you might have hoped :-) Just a few initial thoughts...

Point one is that the font table starting \fonttbl simply assigns font numbers to a number of fonts defined using certain characteristics. In the remainder of the RTF document, the font selected by \deff17 will be referenced as \f17 and so on. But there is nothing magical about "17" itself, and the same font could be referenced by a different \deff number in different documents. Word does set up a number of fonts in the \fonttbl by default, and in practice they may well be invariant between different instances of Word, but if you open a document, change it, and save it, there is no reason why, in theory, Word might not completely reorganise the font table. So what does the \f17 entry behind \deff17 in your Additional Text_4_Sent by me.rtf actually say?

Second, when a Windows program such as Word tries to use a font in Windows using the "Windows GDI" (Graphics Device Interface), it selects a font based on a number of criteria, and interestingly enough, the "Facename" (Arial, Times New Roman, etc.) is, or at least was, not the first in the list. This dates back to the time before TrueType etc., when fonts were typically tied to a very small pre-Unicode character set. I believe it searches using the following sequence:

Character set
Pitch
Family (e.g. Decorative, Modern, Roman, Script, Swiss)
Facename

I also see that in 3 out of the 4 files "sent by you" you have

{\rtf1\ansi\ansicpg1252\uc1

rather than

{\rtf1\adeflang1025\ansi\ansicpg1252\uc1

\adeflang is AFAIK an RTF 1.9 (Word 2007) keyword that specifies the "Default language ID for South Asian/Middle Eastern text in Word. The default languages are determined by the current primary editing language and the enabled editing languages (can be changed via Microsoft Office Language Settings applet)." So I would guess that this keyword is only added if the user is using Word 2007 or perhaps the compatibility pack. It could be that you are using Word 2003 or saving as Word 2003 compatible format. 1025 actually specifies "Arabic (Saudi Arabia)", I think.

However, I do not know whether this setting will come into play unless text is marked explicitly as being in a South Asian/Middle Eastern language, and off the top of my head I can't tell you how that would be done in RTF. I think this is more to do with the /human language/ being used than the script.

\deflang defines something similar for all text in the document marked as \plain: e.g. \deflang1033 is English (U.S.), and \deflangfe does a similar job for East Asian text - e.g. \deflangfe2052 is Chinese (PRC). I suppose it could be significant that one of the files sent to you has \deflangfe2052 and others have \deflangfe1033.

--
Peter Jamieson
http://tips.pjmsn.me.uk
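The link between \deffN and the font table described above can be checked mechanically. The Python sketch below is a rough illustration, not a full RTF parser: the function names are invented, it ignores escaped braces, and the sample header is hypothetical. The \f17 SimSun entry with \fcharset134 (the Simplified Chinese character set) is an assumption based on Hans's remark that f17 normally comes out as SimSun; the thread never shows the real entry.

```python
import re

def font_table_group(rtf: str) -> str:
    """Return the {\\fonttbl ...} group, found by counting braces."""
    start = rtf.find(r"{\fonttbl")
    if start < 0:
        return ""
    depth = 0
    for i in range(start, len(rtf)):
        if rtf[i] == "{":
            depth += 1
        elif rtf[i] == "}":
            depth -= 1
            if depth == 0:
                return rtf[start:i + 1]
    return rtf[start:]  # truncated snippet: use whatever is there

# One font entry: {\fN ... Name;} with at most one level of nested groups.
ENTRY = re.compile(r"\{\\f(\d+)((?:[^{}]|\{[^{}]*\})*)\}")

def parse_fonts(rtf: str) -> dict:
    """Map font number -> (face name, charset or None)."""
    fonts = {}
    for match in ENTRY.finditer(font_table_group(rtf)):
        body = match.group(2)
        charset = re.search(r"\\fcharset(\d+)", body)
        flat = re.sub(r"\{[^{}]*\}", "", body)         # drop {\*\panose ...}
        name = re.sub(r"\\[a-z]+-?\d* ?", "", flat)    # drop control words
        fonts[int(match.group(1))] = (
            name.strip().rstrip(";"),
            int(charset.group(1)) if charset else None,
        )
    return fonts

def default_font(rtf: str):
    """Return (number, name, charset) for the font named by \\deffN."""
    match = re.search(r"\\deff(\d+)", rtf)  # does not match \adeffN
    if match is None:
        return None
    number = int(match.group(1))
    name, charset = parse_fonts(rtf).get(number, (None, None))
    return number, name, charset

# Hypothetical header: the \f17 entry is assumed, not taken from the thread.
sample = (r"{\rtf1\ansi\ansicpg1252\uc1 \deff17\deflang1033\deflangfe1033"
          r"{\fonttbl{\f0\froman\fcharset0\fprq2{\*\panose 02020603050405020304}"
          r"Times New Roman;}{\f17\fnil\fcharset134\fprq2 SimSun;}}}")

print(default_font(sample))
```

Given the font-matching order described above (character set before facename), the \fcharset value reported for the default font is probably the first thing worth comparing between a file that displays correctly and one that does not.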
#15
Peter, I am very sorry that it took ME so long to get back here. I cannot get notification to work consistently, not even when I use a newsreader (XanaNews) (although in this case, I used the web interface). I do not know how others remember what they post when they get no notification.

I will print your post and go through it carefully, and then get back. Again, sorry for my lateness.

Regards,
Hans L