Home |
Search |
Today's Posts |
#1
Posted to microsoft.public.word.mailmerge.fields
|
|||
|
|||
ODC connections - interpreting HTML
I have a question about ODC connections set up to mailmerge from a datafile
into Word. Is there entry you can put into the associated Schema.ini for this ODC connection to enable the interpretation of HTML tags so that when the data is merged into Word the HTML gets generated and does not show as just the HTML tags. My datafile looks like so: TABLE TABLETRTD WIDTH=80Course /TDTD WIDTH=200Title /TDTD WIDTH=60Level/TDTD WIDTH=60Points /TDTD WIDTH=60ECTS /TDTD WIDTH=100Result /TD/TR/TABLE where table is the header and the record is the HTML. My entry in the Schema.ini is [names_list.txt] Format=Delimited(}) ---- I have also found that if I was just to merge this data to a Word document without using this ODC, the HTML seems to get interpreted ok, and in the document a table gets created. This would be using the default text intrepretor set up in the Windows registry presumably. So would this be possible? Many thanks |
#2
Posted to microsoft.public.word.mailmerge.fields
|
|||
|
|||
ODC connections - interpreting HTML
I'm not sure you can use the text file provider + SCHEMA.INI for that -
MS's documentation suggests that the text file provider and the HTML provider are the same thing, but I think you have to specify different extended properties in yor connection string, i.e. Extended Properties=""HTML Import;HDR=YES;"";" or something like Extended Properties="HTML Import;HDR=YES;" in a .odc file. (The engine type, 112, is different from the text file engine type, 96, but in this case it seems that you need the extended properties rather than the engine type). If the file is also encoded usig a particular character set, I don't know how you would specify that. I was never sure how international character set stuff was handled in earlier HTML versions anyway. FWIW 1. the header row can be defined as trthfield1/ththfield2/ththetc./th/tr or trtdfield1/tdtdfield2/tdtdetc./td/tr 2. If you have more than one table, the provider use the name in any caption tag in the table if there is one, but for "uncaptioned" tables it seems to name the tables table table1 table2 etc. You may need to name the table in the .odc or in the SQLStatement parameter of the OpenDataSource call (e.g. SELECT * FROM [table] ) As far as I can see, this all works with simple examples but I've never used it "for real". You appear to need a .odc when the data gets more complex (perhaps if there is more than one table or you named a table with a caption), but you may be able to specify the .htm file as the Name parameter and put everything else in the Connection parameter (and possibly the SQLStatement paramter) in simpler cases. Peter Jamieson http://tips.pjmsn.me.uk Emlyn B wrote: I have a question about ODC connections set up to mailmerge from a datafile into Word. Is there entry you can put into the associated Schema.ini for this ODC connection to enable the interpretation of HTML tags so that when the data is merged into Word the HTML gets generated and does not show as just the HTML tags. My datafile looks like so: TABLE TABLETRTD WIDTH=80Course /TDTD WIDTH=200Title /TDTD WIDTH=60Level/TDTD WIDTH=60Points /TDTD WIDTH=60ECTS /TDTD WIDTH=100Result /TD/TR/TABLE where table is the header and the record is the HTML. My entry in the Schema.ini is [names_list.txt] Format=Delimited(}) ---- I have also found that if I was just to merge this data to a Word document without using this ODC, the HTML seems to get interpreted ok, and in the document a table gets created. This would be using the default text intrepretor set up in the Windows registry presumably. So would this be possible? Many thanks |
#3
Posted to microsoft.public.word.mailmerge.fields
|
|||
|
|||
ODC connections - interpreting HTML
Thanks for replying. However, I probably should have shown you what the full
file looks like: DATE_REPORT}RUN_ID}TEMPLATE_ID}TABLE_EXAMPLE 06 February 2009}18}DSUPP.DOC}TABLETRTD WIDTH=80Course /TDTD WIDTH=200Title /TDTD WIDTH=60Level/TDTD WIDTH=60Points /TDTD WIDTH=60ECTS /TDTD WIDTH=100Result /TD/TR/TABLE That is it is delimited by '}' and contains both text and html. So I want the merge process to read all these field and merge 06 February 2009 into the DATE_REPORT field and merge the html table into the TABLE EXAMPLE field. When I followed your suggestion, of putting HTML Import;HDR=YES into the extended properties, only the HTML table appears to be read. By the looks of things it is not possible. You can only have one or the other. However if I was to open the doc using the default Windows text converter, it seems to work ok. Is this possible? Thanks in advance. "Peter Jamieson" wrote: I'm not sure you can use the text file provider + SCHEMA.INI for that - MS's documentation suggests that the text file provider and the HTML provider are the same thing, but I think you have to specify different extended properties in yor connection string, i.e. Extended Properties=""HTML Import;HDR=YES;"";" or something like Extended Properties="HTML Import;HDR=YES;" in a .odc file. (The engine type, 112, is different from the text file engine type, 96, but in this case it seems that you need the extended properties rather than the engine type). If the file is also encoded usig a particular character set, I don't know how you would specify that. I was never sure how international character set stuff was handled in earlier HTML versions anyway. FWIW 1. the header row can be defined as trthfield1/ththfield2/ththetc./th/tr or trtdfield1/tdtdfield2/tdtdetc./td/tr 2. If you have more than one table, the provider use the name in any caption tag in the table if there is one, but for "uncaptioned" tables it seems to name the tables table table1 table2 etc. You may need to name the table in the .odc or in the SQLStatement parameter of the OpenDataSource call (e.g. SELECT * FROM [table] ) As far as I can see, this all works with simple examples but I've never used it "for real". You appear to need a .odc when the data gets more complex (perhaps if there is more than one table or you named a table with a caption), but you may be able to specify the .htm file as the Name parameter and put everything else in the Connection parameter (and possibly the SQLStatement paramter) in simpler cases. Peter Jamieson http://tips.pjmsn.me.uk Emlyn B wrote: I have a question about ODC connections set up to mailmerge from a datafile into Word. Is there entry you can put into the associated Schema.ini for this ODC connection to enable the interpretation of HTML tags so that when the data is merged into Word the HTML gets generated and does not show as just the HTML tags. My datafile looks like so: TABLE TABLETRTD WIDTH=80Course /TDTD WIDTH=200Title /TDTD WIDTH=60Level/TDTD WIDTH=60Points /TDTD WIDTH=60ECTS /TDTD WIDTH=100Result /TD/TR/TABLE where table is the header and the record is the HTML. My entry in the Schema.ini is [names_list.txt] Format=Delimited(}) ---- I have also found that if I was just to merge this data to a Word document without using this ODC, the HTML seems to get interpreted ok, and in the document a table gets created. This would be using the default text intrepretor set up in the Windows registry presumably. So would this be possible? Many thanks |
#4
Posted to microsoft.public.word.mailmerge.fields
|
|||
|
|||
ODC connections - interpreting HTML
I see what you mean. It's quite interesting that the converter
treats the file that way. However, so far I haven't seen a simple way to grab both the regular data and the HTML data, certainly not during a merge operation. Peter Jamieson http://tips.pjmsn.me.uk Emlyn B wrote: Thanks for replying. However, I probably should have shown you what the full file looks like: DATE_REPORT}RUN_ID}TEMPLATE_ID}TABLE_EXAMPLE 06 February 2009}18}DSUPP.DOC}TABLETRTD WIDTH=80Course /TDTD WIDTH=200Title /TDTD WIDTH=60Level/TDTD WIDTH=60Points /TDTD WIDTH=60ECTS /TDTD WIDTH=100Result /TD/TR/TABLE That is it is delimited by '}' and contains both text and html. So I want the merge process to read all these field and merge 06 February 2009 into the DATE_REPORT field and merge the html table into the TABLE EXAMPLE field. When I followed your suggestion, of putting HTML Import;HDR=YES into the extended properties, only the HTML table appears to be read. By the looks of things it is not possible. You can only have one or the other. However if I was to open the doc using the default Windows text converter, it seems to work ok. Is this possible? Thanks in advance. "Peter Jamieson" wrote: I'm not sure you can use the text file provider + SCHEMA.INI for that - MS's documentation suggests that the text file provider and the HTML provider are the same thing, but I think you have to specify different extended properties in yor connection string, i.e. Extended Properties=""HTML Import;HDR=YES;"";" or something like Extended Properties="HTML Import;HDR=YES;" in a .odc file. (The engine type, 112, is different from the text file engine type, 96, but in this case it seems that you need the extended properties rather than the engine type). If the file is also encoded usig a particular character set, I don't know how you would specify that. I was never sure how international character set stuff was handled in earlier HTML versions anyway. FWIW 1. the header row can be defined as trthfield1/ththfield2/ththetc./th/tr or trtdfield1/tdtdfield2/tdtdetc./td/tr 2. If you have more than one table, the provider use the name in any caption tag in the table if there is one, but for "uncaptioned" tables it seems to name the tables table table1 table2 etc. You may need to name the table in the .odc or in the SQLStatement parameter of the OpenDataSource call (e.g. SELECT * FROM [table] ) As far as I can see, this all works with simple examples but I've never used it "for real". You appear to need a .odc when the data gets more complex (perhaps if there is more than one table or you named a table with a caption), but you may be able to specify the .htm file as the Name parameter and put everything else in the Connection parameter (and possibly the SQLStatement paramter) in simpler cases. Peter Jamieson http://tips.pjmsn.me.uk Emlyn B wrote: I have a question about ODC connections set up to mailmerge from a datafile into Word. Is there entry you can put into the associated Schema.ini for this ODC connection to enable the interpretation of HTML tags so that when the data is merged into Word the HTML gets generated and does not show as just the HTML tags. My datafile looks like so: TABLE TABLETRTD WIDTH=80Course /TDTD WIDTH=200Title /TDTD WIDTH=60Level/TDTD WIDTH=60Points /TDTD WIDTH=60ECTS /TDTD WIDTH=100Result /TD/TR/TABLE where table is the header and the record is the HTML. My entry in the Schema.ini is [names_list.txt] Format=Delimited(}) ---- I have also found that if I was just to merge this data to a Word document without using this ODC, the HTML seems to get interpreted ok, and in the document a table gets created. This would be using the default text intrepretor set up in the Windows registry presumably. So would this be possible? Many thanks |
#5
Posted to microsoft.public.word.mailmerge.fields
|
|||
|
|||
ODC connections - interpreting HTML
OK, I have had a further look at this. Frankly, I think you will
probably be better off either trying to get your source data in another format, or if you are in a position to do so, writing code to transform it into something easier to deal with. If you go down that path, what is feasible depends on your coding capabilities - e.g. you could try to write Word VBA code that reads the file and generates the Word document you want, or you could write code that transforms the data into a format that makes it possible to produce output using simple techniques - e.g. transform it into two Access tables then use the Access reporter to generate your output, or if you have to use Word for the output, transform the data into a flat file that would then allow you to use the "one-to-many" catalog merge documented by macropod (have a look around this group for a pointer to that). Of course, the chances are that the data originates from a couple of tables in a database somewhere and that if you were able to start from that, you could avoid all these intermediate steps. The following idea are really so shaky I wouldn't dream of recommending them, but since I spent quite a lot of time looking at the problem I suppose I may as well write it up as someone may gain something from it. In order for this to have the slightest chance of working, your HTML must not contain any newlines/carriage returns etc. (which will break at least part of this), and assuming that you could have multiple rows in mixed.txt, each with an HTML table, there might be problems with number of rows and the length of the HTML. If you only ever have one row, you could simplify this. Suppose your file is called mixed.txt and is in c:\a a. Make a copy of mixed.txt called mixed.htm (even though it isn't a ..htm file) b. edit the appropriate SCHEMA.INI so that mixed.txt is defined as Delimiter(}) c. create a mixedtxt.odc that accesses mixed.txt as a text file - i.e. the Data Source is c:\a and the engine type is 96 d. create a mixedhtm.odc that accesses mixed.htm as a HTM file, i.e. the Data Source is c:\a\mixed.htm, Extended Properties="HTML Import;HDR=YES;"; and if you like, the Engine type is 112 e. use mixedtxt.odc as the data source for the merge. Don't try to insert the column that contains the table. f. insert a DATABASE field to insert mixedhtm.odc. At first it probably won't work, because Word may complain about the connection string. But reduce the field to { DATABASE \d "C:\\a\\mixedhtm.odc" \s "SELECT * FROM `Table`" \h } That successfully inserts the table from row 1 To get the field to insert from subsequent rows, insert a nested field , something like this: { DATABASE \d "C:\\a\\mixedhtm.odc" \s "SELECT * FROM `Table{ = { MERGEREC }-1 \#"0;0;''" }`" \h } In other words, if it's record 1, use "Table", if it's record 2, use "Table1" etc. Merge to a new document. Reformat all the tables as necessary. Peter Jamieson http://tips.pjmsn.me.uk Peter Jamieson wrote: I see what you mean. It's quite interesting that the converter treats the file that way. However, so far I haven't seen a simple way to grab both the regular data and the HTML data, certainly not during a merge operation. Peter Jamieson http://tips.pjmsn.me.uk Emlyn B wrote: Thanks for replying. However, I probably should have shown you what the full file looks like: DATE_REPORT}RUN_ID}TEMPLATE_ID}TABLE_EXAMPLE 06 February 2009}18}DSUPP.DOC}TABLETRTD WIDTH=80Course /TDTD WIDTH=200Title /TDTD WIDTH=60Level/TDTD WIDTH=60Points /TDTD WIDTH=60ECTS /TDTD WIDTH=100Result /TD/TR/TABLE That is it is delimited by '}' and contains both text and html. So I want the merge process to read all these field and merge 06 February 2009 into the DATE_REPORT field and merge the html table into the TABLE EXAMPLE field. When I followed your suggestion, of putting HTML Import;HDR=YES into the extended properties, only the HTML table appears to be read. By the looks of things it is not possible. You can only have one or the other. However if I was to open the doc using the default Windows text converter, it seems to work ok. Is this possible? Thanks in advance. "Peter Jamieson" wrote: I'm not sure you can use the text file provider + SCHEMA.INI for that - MS's documentation suggests that the text file provider and the HTML provider are the same thing, but I think you have to specify different extended properties in yor connection string, i.e. Extended Properties=""HTML Import;HDR=YES;"";" or something like Extended Properties="HTML Import;HDR=YES;" in a .odc file. (The engine type, 112, is different from the text file engine type, 96, but in this case it seems that you need the extended properties rather than the engine type). If the file is also encoded usig a particular character set, I don't know how you would specify that. I was never sure how international character set stuff was handled in earlier HTML versions anyway. FWIW 1. the header row can be defined as trthfield1/ththfield2/ththetc./th/tr or trtdfield1/tdtdfield2/tdtdetc./td/tr 2. If you have more than one table, the provider use the name in any caption tag in the table if there is one, but for "uncaptioned" tables it seems to name the tables table table1 table2 etc. You may need to name the table in the .odc or in the SQLStatement parameter of the OpenDataSource call (e.g. SELECT * FROM [table] ) As far as I can see, this all works with simple examples but I've never used it "for real". You appear to need a .odc when the data gets more complex (perhaps if there is more than one table or you named a table with a caption), but you may be able to specify the .htm file as the Name parameter and put everything else in the Connection parameter (and possibly the SQLStatement paramter) in simpler cases. Peter Jamieson http://tips.pjmsn.me.uk Emlyn B wrote: I have a question about ODC connections set up to mailmerge from a datafile into Word. Is there entry you can put into the associated Schema.ini for this ODC connection to enable the interpretation of HTML tags so that when the data is merged into Word the HTML gets generated and does not show as just the HTML tags. My datafile looks like so: TABLE TABLETRTD WIDTH=80Course /TDTD WIDTH=200Title /TDTD WIDTH=60Level/TDTD WIDTH=60Points /TDTD WIDTH=60ECTS /TDTD WIDTH=100Result /TD/TR/TABLE where table is the header and the record is the HTML. My entry in the Schema.ini is [names_list.txt] Format=Delimited(}) ---- I have also found that if I was just to merge this data to a Word document without using this ODC, the HTML seems to get interpreted ok, and in the document a table gets created. This would be using the default text intrepretor set up in the Windows registry presumably. So would this be possible? Many thanks |
#6
Posted to microsoft.public.word.mailmerge.fields
|
|||
|
|||
ODC connections - interpreting HTML
Thanks for these recommendations. I will try them.
"Peter Jamieson" wrote: OK, I have had a further look at this. Frankly, I think you will probably be better off either trying to get your source data in another format, or if you are in a position to do so, writing code to transform it into something easier to deal with. If you go down that path, what is feasible depends on your coding capabilities - e.g. you could try to write Word VBA code that reads the file and generates the Word document you want, or you could write code that transforms the data into a format that makes it possible to produce output using simple techniques - e.g. transform it into two Access tables then use the Access reporter to generate your output, or if you have to use Word for the output, transform the data into a flat file that would then allow you to use the "one-to-many" catalog merge documented by macropod (have a look around this group for a pointer to that). Of course, the chances are that the data originates from a couple of tables in a database somewhere and that if you were able to start from that, you could avoid all these intermediate steps. The following idea are really so shaky I wouldn't dream of recommending them, but since I spent quite a lot of time looking at the problem I suppose I may as well write it up as someone may gain something from it. In order for this to have the slightest chance of working, your HTML must not contain any newlines/carriage returns etc. (which will break at least part of this), and assuming that you could have multiple rows in mixed.txt, each with an HTML table, there might be problems with number of rows and the length of the HTML. If you only ever have one row, you could simplify this. Suppose your file is called mixed.txt and is in c:\a a. Make a copy of mixed.txt called mixed.htm (even though it isn't a ..htm file) b. edit the appropriate SCHEMA.INI so that mixed.txt is defined as Delimiter(}) c. create a mixedtxt.odc that accesses mixed.txt as a text file - i.e. the Data Source is c:\a and the engine type is 96 d. create a mixedhtm.odc that accesses mixed.htm as a HTM file, i.e. the Data Source is c:\a\mixed.htm, Extended Properties="HTML Import;HDR=YES;"; and if you like, the Engine type is 112 e. use mixedtxt.odc as the data source for the merge. Don't try to insert the column that contains the table. f. insert a DATABASE field to insert mixedhtm.odc. At first it probably won't work, because Word may complain about the connection string. But reduce the field to { DATABASE \d "C:\\a\\mixedhtm.odc" \s "SELECT * FROM `Table`" \h } That successfully inserts the table from row 1 To get the field to insert from subsequent rows, insert a nested field , something like this: { DATABASE \d "C:\\a\\mixedhtm.odc" \s "SELECT * FROM `Table{ = { MERGEREC }-1 \#"0;0;''" }`" \h } In other words, if it's record 1, use "Table", if it's record 2, use "Table1" etc. Merge to a new document. Reformat all the tables as necessary. Peter Jamieson http://tips.pjmsn.me.uk Peter Jamieson wrote: I see what you mean. It's quite interesting that the converter treats the file that way. However, so far I haven't seen a simple way to grab both the regular data and the HTML data, certainly not during a merge operation. Peter Jamieson http://tips.pjmsn.me.uk Emlyn B wrote: Thanks for replying. However, I probably should have shown you what the full file looks like: DATE_REPORT}RUN_ID}TEMPLATE_ID}TABLE_EXAMPLE 06 February 2009}18}DSUPP.DOC}TABLETRTD WIDTH=80Course /TDTD WIDTH=200Title /TDTD WIDTH=60Level/TDTD WIDTH=60Points /TDTD WIDTH=60ECTS /TDTD WIDTH=100Result /TD/TR/TABLE That is it is delimited by '}' and contains both text and html. So I want the merge process to read all these field and merge 06 February 2009 into the DATE_REPORT field and merge the html table into the TABLE EXAMPLE field. When I followed your suggestion, of putting HTML Import;HDR=YES into the extended properties, only the HTML table appears to be read. By the looks of things it is not possible. You can only have one or the other. However if I was to open the doc using the default Windows text converter, it seems to work ok. Is this possible? Thanks in advance. "Peter Jamieson" wrote: I'm not sure you can use the text file provider + SCHEMA.INI for that - MS's documentation suggests that the text file provider and the HTML provider are the same thing, but I think you have to specify different extended properties in yor connection string, i.e. Extended Properties=""HTML Import;HDR=YES;"";" or something like Extended Properties="HTML Import;HDR=YES;" in a .odc file. (The engine type, 112, is different from the text file engine type, 96, but in this case it seems that you need the extended properties rather than the engine type). If the file is also encoded usig a particular character set, I don't know how you would specify that. I was never sure how international character set stuff was handled in earlier HTML versions anyway. FWIW 1. the header row can be defined as trthfield1/ththfield2/ththetc./th/tr or trtdfield1/tdtdfield2/tdtdetc./td/tr 2. If you have more than one table, the provider use the name in any caption tag in the table if there is one, but for "uncaptioned" tables it seems to name the tables table table1 table2 etc. You may need to name the table in the .odc or in the SQLStatement parameter of the OpenDataSource call (e.g. SELECT * FROM [table] ) As far as I can see, this all works with simple examples but I've never used it "for real". You appear to need a .odc when the data gets more complex (perhaps if there is more than one table or you named a table with a caption), but you may be able to specify the .htm file as the Name parameter and put everything else in the Connection parameter (and possibly the SQLStatement paramter) in simpler cases. Peter Jamieson http://tips.pjmsn.me.uk Emlyn B wrote: I have a question about ODC connections set up to mailmerge from a datafile into Word. Is there entry you can put into the associated Schema.ini for this ODC connection to enable the interpretation of HTML tags so that when the data is merged into Word the HTML gets generated and does not show as just the HTML tags. My datafile looks like so: TABLE TABLETRTD WIDTH=80Course /TDTD WIDTH=200Title /TDTD WIDTH=60Level/TDTD WIDTH=60Points /TDTD WIDTH=60ECTS /TDTD WIDTH=100Result /TD/TR/TABLE where table is the header and the record is the HTML. My entry in the Schema.ini is [names_list.txt] Format=Delimited(}) ---- I have also found that if I was just to merge this data to a Word document without using this ODC, the HTML seems to get interpreted ok, and in the document a table gets created. This would be using the default text intrepretor set up in the Windows registry presumably. So would this be possible? Many thanks |
Reply |
Thread Tools | |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
DDE Connections | Mailmerge | |||
Word 2002 lost connections | Microsoft Word Help | |||
Exporting Network connections from XP-Pro to Vista | New Users | |||
html vs htm | New Users | |||
What is HTML? | Microsoft Word Help |