#1
Posted to microsoft.public.word.docmanagement
grammatim[_2_]
text to bibliography?

The latest estimate is that I'll have my new computer (with Office 07)
this afternoon ...

Is there a way to convert an existing bibliography, i.e. formatted
list of references, into a Word 2007 table of sources (or whatever
it's called) -- are they, like, like Excel tables, or (heaven forfend)
Access tables, so that something along the lines of tab delimiters
might work?
#2
Posted to microsoft.public.word.docmanagement
p0
text to bibliography?

I'm not really sure what you are trying to accomplish.

Some background: in Word 2007, bibliographic entries are actually
stored inside a custom XML file in the docx. The file, often called
item1.xml, has the following format:

<b:Sources SelectedStyle="\something.xsl"
           StyleName="A style called something"
           xmlns:b="http://schemas.openxmlformats.org/officeDocument/2006/bibliography"
           xmlns="http://schemas.openxmlformats.org/officeDocument/2006/bibliography">
  <b:Source>...</b:Source>
  <b:Source>...</b:Source>
  <b:Source>...</b:Source>
</b:Sources>

where every b:Source element represents a bibliographic source. For a
description of the content of a b:Source element, you can check out
section 7.6 of http://www.ecma-international.org/pu...04%20(PDF).zip

Then, when the sources need to be displayed, one of the stylesheets
with the different bibliographic styles (APA, MLA, ...) gets a piece
of XML containing one or more b:Source elements and outputs a piece of
HTML. That HTML is then displayed in Word 2007 as your in-text
citation or bibliography.
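
To make the storage format above concrete, here is a minimal Python sketch that reads a b:Sources document and lists its entries. The sample data is hypothetical, and the child elements it reads (b:Tag, b:SourceType, b:Title) are assumptions about what a given b:Source contains; the XML would first have to be extracted from the docx, which is a zip archive.

```python
import xml.etree.ElementTree as ET

# Namespace quoted in the post above for Word 2007 bibliography data.
B_NS = "http://schemas.openxmlformats.org/officeDocument/2006/bibliography"
NS = {"b": B_NS}


def list_sources(xml_text):
    """Return (tag, source type, title) for each b:Source under b:Sources."""
    root = ET.fromstring(xml_text)
    rows = []
    for source in root.findall("b:Source", NS):
        rows.append((
            source.findtext("b:Tag", default="", namespaces=NS),
            source.findtext("b:SourceType", default="", namespaces=NS),
            source.findtext("b:Title", default="", namespaces=NS),
        ))
    return rows


# Hypothetical sample, not taken from a real document.
SAMPLE = f"""<b:Sources xmlns:b="{B_NS}" xmlns="{B_NS}">
  <b:Source>
    <b:Tag>Doe08</b:Tag>
    <b:SourceType>Book</b:SourceType>
    <b:Title>A book by myself</b:Title>
    <b:Year>2008</b:Year>
  </b:Source>
</b:Sources>"""

if __name__ == "__main__":
    for row in list_sources(SAMPLE):
        print(row)
```

Missing children come back as empty strings, so the sketch does not fail on source types that lack a title or tag.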

Doing the reverse operation (at least that's what I think you are
trying to do) is not something Word supports. You could try to create
your own parser for that, but it seems overly complex to me. How would
you expect a parser to identify the type of a bibliographic entry:
Book, BookSection, JournalArticle, ArticleInAPeriodical, ...? And how
would you distinguish between the different contributors of a work:
Author, Artist, Editor, Translator, Writer, Producer, Performer, ...?

If references to a work are available online somewhere, it might be
possible to more easily get them that way according to the following
article http://savas.parastatidis.name/2007/...5f173b55c.aspx.
However, I have never seen any code actually doing this and as far as
I know, Microsoft pulled the plug on their academic search.

BR,

Yves
--
http://www.codeplex.com/bibliography
#3
Posted to microsoft.public.word.docmanagement
grammatim[_2_]
text to bibliography?

I have, of course, many articles with bibliographies, and it would be
a lot quicker to insert tab delimiters -- or even a code -- before
each element in each entry in the list, so that the bibliography
database could be created/added to (as if a computer were involved)
without retyping every entry in toto.

I should hope the software is smart enough to realize that the
"Author" of a journal article is the same sort of beast as the
"Author" of a book!

#4
Posted to microsoft.public.word.docmanagement
p0
text to bibliography?

The software does not support what you want, and I doubt it ever
will.

An "author" of a book is not necessarily represented by a
b:Author/b:Author element in your XML. If the book were an edited
book, then the author would be a b:Author/b:Editor element. If the
book were a translated work, the author would be a
b:Author/b:Translator element. So there are several options. And what
if the book were replaced with a film? Then you wouldn't have an
author at all; you would have performers, directors, writers, and
producers.

Adding a code in front of every element of every entry is pretty much
the same as putting each element between XML tags. So I guess you
could create the database yourself. You can always create a book and a
journal article through Word 2007, study the resulting sources.xml
(located at %appdata%\Microsoft\Bibliography), and then copy and paste
all other data into that XML file. That should be a bit faster than
copy/pasting everything into a source form for every entry.
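
As a rough illustration of the "codes are just tags" point, the following Python sketch turns a hand-prepared, tab-delimited list into b:Source elements. The column order, the b:Tag value, and the choice to treat every contributor as b:Author/b:Author are all assumptions made for the example. It has not been checked against what Word actually accepts in sources.xml, and Word may expect additional elements.

```python
import csv
import io
import xml.etree.ElementTree as ET

B_NS = "http://schemas.openxmlformats.org/officeDocument/2006/bibliography"
ET.register_namespace("b", B_NS)

# Assumed column order for the hand-prepared, tab-delimited list.
COLUMNS = ["SourceType", "Tag", "First", "Last", "Title", "City", "Year"]


def b(name):
    """Qualify an element name with the bibliography namespace."""
    return f"{{{B_NS}}}{name}"


def row_to_source(row):
    """Build one b:Source element from a dict keyed by COLUMNS."""
    source = ET.Element(b("Source"))
    ET.SubElement(source, b("Tag")).text = row["Tag"]
    ET.SubElement(source, b("SourceType")).text = row["SourceType"]
    # Contributors nest as b:Author / b:Author / b:NameList / b:Person.
    role = ET.SubElement(ET.SubElement(source, b("Author")), b("Author"))
    person = ET.SubElement(ET.SubElement(role, b("NameList")), b("Person"))
    ET.SubElement(person, b("Last")).text = row["Last"]
    ET.SubElement(person, b("First")).text = row["First"]
    for field in ("Title", "City", "Year"):
        ET.SubElement(source, b(field)).text = row[field]
    return source


def tsv_to_sources(text):
    """Convert tab-delimited lines into a b:Sources element tree."""
    root = ET.Element(b("Sources"))
    reader = csv.DictReader(
        io.StringIO(text),
        fieldnames=COLUMNS,
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # titles may contain literal quote marks
    )
    for row in reader:
        root.append(row_to_source(row))
    return root


if __name__ == "__main__":
    sample = "Book\tDoe08\tJohn\tDoe\tA book by myself\tLondon\t2008\n"
    print(ET.tostring(tsv_to_sources(sample), encoding="unicode"))
```

An edited or translated work would need a different role element (b:Editor, b:Translator) in place of the inner b:Author, which is exactly the per-entry decision a person still has to make.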

Yves

#5
Posted to microsoft.public.word.docmanagement
grammatim[_2_]
text to bibliography?

On Aug 14, 6:34 pm, p0 wrote:
> The software does not support what you want. And I think it is
> doubtful it ever will.
>
> An "author" of a book is not necessarily represented by a
> b:Author/b:Author element in your XML.

You seem to be overlooking the bit where I said this could be done by
inserting a code, which would still be vastly preferable to retyping
the entire contents of a full subject bibliography.

> If the book would be an edited book,
> then the author would be a b:Author/b:Editor element. If the book
> would be a translated work, the author would be a
> b:Author/b:Translator element.

If it were a proper relational database, then there would be a list of
"names," and in any particular instance, a name could be an "author,"
an "editor," a "translator," or even some combination of the above.

> So there are several options. And what if the
> book would be replaced with a film. Then you wouldn't have an author
> at all, you would have performers, directors, writers, and producers.

Each one of which is a "name," and which, again, could be associated
with several of the above categories.

#6
Posted to microsoft.public.word.docmanagement
p0
text to bibliography?



grammatim wrote:
> On Aug 14, 6:34 pm, p0 wrote:
>> The software does not support what you want. And I think it is
>> doubtful it ever will.
>>
>> An "author" of a book is not necessarily represented by a
>> b:Author/b:Author element in your XML.
>
> You seem to be overlooking the bit where I said this could be done by
> inserting a code, which would still be vastly preferable to retyping
> the entire contents of a full subject bibliography.

I was talking about automatically processing lists without codes. The
second part of my reply stated that as soon as you start using codes,
you can just as well use XML tags, and there would be no point in
doing anything automatically.

>> If the book would be an edited book,
>> then the author would be a b:Author/b:Editor element. If the book
>> would be a translated work, the author would be a
>> b:Author/b:Translator element.
>
> If it were a proper relational database, then there would be a list of
> "names," and in any particular instance, a name could be an "author,"
> an "editor," a "translator," or even some combination of the above.

I am not a specialist when it comes to relational databases, but I
agree that the current layout is not in full normal form. However, the
way the names are stored seems to be similar to that of other programs
(EndNote - http://www.endnote.com/support/helpdocs/endnote.zip).

Personally, I also see no gain in going for a relational database in
full normal form where names (b:Person elements) are put in a separate
list. For starters, if you shared the same names across multiple
sources, you would have to create a two-way link: source to one or
more names, and name to one or more sources. The reason for the first
link is obvious: it indicates which names participated in the source.
The second link is necessary in case you remove a source: you would
have to know whether a name had become obsolete. I suspect the
overhead would be greater than the benefits in this case.

Alternatively, if you did not share the names across sources and just
used them within one source, it is highly unlikely that one name would
be used multiple times. There are exceptions, such as the author of a
book section also being the editor of the book, but in most cases the
'database' for a single source would actually be larger and require
more processing to obtain the same result.
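
The two-way link described above can be sketched in a few lines of Python. This is a toy model for illustration only; the class and method names are made up and have nothing to do with how Word or EndNote store their data.

```python
from collections import defaultdict


class SharedNameStore:
    """Toy 'normalized' layout: names are shared, sources point at them."""

    def __init__(self):
        self.names_by_source = {}                # source id -> set of names
        self.sources_by_name = defaultdict(set)  # name -> set of source ids

    def add_source(self, source_id, names):
        """Record the first link (source to names) and the second (name to sources)."""
        self.names_by_source[source_id] = set(names)
        for name in names:
            self.sources_by_name[name].add(source_id)

    def remove_source(self, source_id):
        """Remove a source and return the names no remaining source uses."""
        orphaned = []
        for name in self.names_by_source.pop(source_id):
            self.sources_by_name[name].discard(source_id)
            if not self.sources_by_name[name]:
                # Without the name-to-sources link this check would need
                # a scan over every remaining source.
                del self.sources_by_name[name]
                orphaned.append(name)
        return sorted(orphaned)


if __name__ == "__main__":
    store = SharedNameStore()
    store.add_source("Doe08", ["John Doe", "Jane Roe"])
    store.add_source("Doe09", ["John Doe"])
    print(store.remove_source("Doe08"))
```

Both mappings have to be kept in step on every insert and delete, which is the bookkeeping overhead the post refers to.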

#7
Posted to microsoft.public.word.docmanagement
grammatim[_2_]
text to bibliography?

On Aug 15, 7:28 pm, p0 wrote:
> I was talking about automatically processing lists without codes. The
> second part of my reply stated that as soon as you start using codes,
> you can just as well use XML tags, and there would be no point in
> doing anything automatically.


How is an "xml tag" not a code?

How is typing (or pasting) a few-letter code not better than typing
entire names, titles, etc.?

> I am not a specialist when it comes to relational databases, but I
> agree that the current layout is not in full normal form. However, the
> way the names are stored seems to be similar to that of other programs
> (EndNote - http://www.endnote.com/support/helpdocs/endnote.zip).
>
> Personally, I also see no gain in going for a relational database in
> full normal form where names (b:Person elements) are put in a separate
> list. For starters, if you shared the same names across multiple
> sources, you would have to create a two-way link: source to one or
> more names, and name to one or more sources. The reason for the first
> link is obvious: it indicates which names participated in the source.
> The second link is necessary in case you remove a source: you would
> have to know whether a name had become obsolete. I suspect the
> overhead would be greater than the benefits in this case.


Not sure why I'd remove a source ...

#8
Posted to microsoft.public.word.docmanagement
p0
text to bibliography?



grammatim wrote:
> How is an "xml tag" not a code?


It is a code, the point is, once you add the xml tags, there is no
more need for processing. You have done the entire processing by hand.
All that is left to do is copy/pasting the sources into the database
and you are done.

> How is typing (or pasting) a few-letter code not better than typing
> entire names, titles, etc.?


XML was intended to be human readable. The disadvantage is that tags
tend to be long; the major advantage is that you don't have to learn a
few dozen non-descriptive codes by heart (would @c be city, comments,
chapternumber, conferencename, country, court, or casenumber?). Also,
the use of closing tags is a good way of getting around punctuation
issues.

The following simple example:

John Doe. "A book by myself". London, 2008.

would result in the following string with xml 'codes':

<b:Source><b:SourceType>Book</b:SourceType>
<b:Author><b:Author><b:NameList><b:Person><b:First>John</b:First>
<b:Last>Doe</b:Last></b:Person></b:NameList></b:Author></b:Author>
<b:Garbage>. "</b:Garbage><b:Title>A book by myself</b:Title>
<b:Garbage>". </b:Garbage><b:City>London</b:City><b:Garbage>, </b:Garbage>
<b:Year>2008</b:Year><b:Garbage>.</b:Garbage></b:Source>

where b:Garbage elements are meaningless elements that should be
removed from the source at a later stage. Do you really think typing,
copying, and pasting all those 'codes' is easier than copy/pasting 5
elements into a form?
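
The "remove at a later stage" step for the tagged string above could look roughly like this in Python. b:Garbage is the made-up placeholder from this post, not part of the bibliography schema, and the sketch only removes direct children of b:Source.

```python
import xml.etree.ElementTree as ET

B_NS = "http://schemas.openxmlformats.org/officeDocument/2006/bibliography"
ET.register_namespace("b", B_NS)

# Placeholder element from the post; not defined by the schema.
GARBAGE = f"{{{B_NS}}}Garbage"


def strip_garbage(source):
    """Remove every direct b:Garbage child of a b:Source element, in place."""
    for child in list(source):  # copy the child list so removal is safe
        if child.tag == GARBAGE:
            source.remove(child)
    return source


# The tagged example string, with the namespace declared so it parses.
TAGGED = (
    f'<b:Source xmlns:b="{B_NS}">'
    "<b:SourceType>Book</b:SourceType>"
    "<b:Author><b:Author><b:NameList><b:Person>"
    "<b:First>John</b:First><b:Last>Doe</b:Last>"
    "</b:Person></b:NameList></b:Author></b:Author>"
    '<b:Garbage>. "</b:Garbage><b:Title>A book by myself</b:Title>'
    '<b:Garbage>". </b:Garbage><b:City>London</b:City>'
    "<b:Garbage>, </b:Garbage><b:Year>2008</b:Year>"
    "<b:Garbage>.</b:Garbage></b:Source>"
)

if __name__ == "__main__":
    cleaned = strip_garbage(ET.fromstring(TAGGED))
    print(ET.tostring(cleaned, encoding="unicode"))
```

The cleanup itself is trivial; the manual work is still in wrapping every element of every entry by hand.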

> Not sure why I'd remove a source ...


Space constraints, access to better sources, overlap, ...

Alternatively, if you would not share the names across sources, and
just use them within one source, it is highly unlikely that one name
will be used multiple times. There are exceptions such as the author
of a book section also being the editor of the book, but in most cases
the 'database' for a single source would actually be larger and
require more processing to obtain the same result.

So there are several options. And what if the
book would be replaced with a film. Then you wouldn't have an author
at all, you would have performers, directors, writers, and producers.


Each one of which is a "name," and which, again, could be associated
with several of the above categories.


You adding a code in front of every element of every entry is pretty
much the same as putting each element in between xml tags. So I guess
you could create the database yourself. You can always create a book
and a journal article through Word 2007 and study the resulting
sources.xml (located at %appdata%\Microsoft\Bibliography) and then
copy paste all other data into that xml file. That should be a bit
faster than copy/pasting everything into a source form for every
entry.


Yves


grammatim schreef:


I have, of course, many articles with bibliographies, and it would be
a lot quicker to insert tab delimiters -- or even a code -- before
each element in each entry in the list, so that the bibliography
database could be created/added to (as if a computer were involved)
without retyping every entry in toto


I should hope the software is smart enough to realize that the
"Author" of a journal article is the same sort of beast as the
"Author" of a book!



  #9   Report Post  
Posted to microsoft.public.word.docmanagement
grammatim[_2_] grammatim[_2_] is offline
external usenet poster
 
Posts: 2,751
Default text to bibliography?

On Aug 16, 8:23 am, p0 wrote:
grammatim schreef:



On Aug 15, 7:28 pm, p0 wrote:
grammatim schreef:


On Aug 14, 6:34 pm, p0 wrote:
The software does not support what you want. And I think it is
doubtful it ever will.


An "author" of a book is not necessarily represented by a b:Author/
b:Author element in your XML.


You seem to be overlooking the bit where I said this could be done by
inserting a code, which would still be vastly preferable to retyping
the entire contents of a full subject bibliography.


I was talking about automatically processing lists without codes. The
second part of my reply stated that as soon as you start using codes,
you can just as well use xml tags and there would be no point in doing
anything automatically.


How is an "xml tag" not a code?


It is a code; the point is, once you add the xml tags, there is no
more need for processing. You have done the entire processing by hand.
All that is left to do is copy/pasting the sources into the database
and you are done.

How is typing (or pasting) a few-letter code not better than typing
entire names, titles, etc.?


Xml was intended to be human readable. The disadvantage is that tags
tend to be long; the major advantage is that you don't have to learn a
few dozen non-descriptive codes by heart (would @c be city, comments,
chapternumber, conferencename, country, court, or casenumber?). Also,
the use of closing tags is a good way of getting around punctuation
issues.

The following simple example:

John Doe. "A book by myself". London, 2008.

would result in the following string with xml 'codes':

<b:Source><b:SourceType>Book</b:SourceType>
<b:Author><b:Author><b:NameList><b:Person><b:First>John</b:First>
<b:Last>Doe</b:Last></b:Person></b:NameList></b:Author></b:Author>
<b:Garbage>. "</b:Garbage><b:Title>A book by myself</b:Title>
<b:Garbage>". </b:Garbage><b:City>London</b:City>
<b:Garbage>, </b:Garbage><b:Year>2008</b:Year><b:Garbage>.</b:Garbage></b:Source>


What a stupid system.

I would expect something like

<au>Doe, John</au><ti>A Book by Myself</ti><pl>London</pl><pu>Smith &
Wesson</pu><yr>2008</yr>

And </yr> isn't needed because it's always 4 digits, and the place and
publisher would use codes rather than spelling out: <pl>L<pu>SW. <au>
is known to always select an item from the Name list -- as are also
<ed>, <tr>, etc.

where b:Garbage elements are meaningless elements and should be
removed from the source at a later stage. Do you really think typing/
copying/pasting out all those 'codes' is easier than copy/pasting 5
elements into a form?


A book with a 20-word title can have six editors.

If the book would be an edited book,
then the author would be a b:Author/b:Editor element. If the book
would be a translated work, the author would be a b:Author/
b:Translator element.


If it were a proper relational database, then there would be a list of
"names," and in any particular instance, a name could be an "author,"
an "editor," a "translator," or even some combination of the above.


I am not a specialist when it comes to relational databases, but I
agree that the current layout is not in full normal form. However, the
way the names are stored seems to be similar to that of other programs
(EndNote - http://www.endnote.com/support/helpdocs/endnote.zip).


Personally, I also see no gain in going for a relational database in
full normal form where names (b:Person elements) are put in a separate
list. For starters, if you would share the same names across multiple
sources, you would have to create a two-way link: source to one or
more names, and name to one or more sources. The reason for the first
link is obvious: indicating which names participated in the source.
The second link is necessary in case you would remove a source. You
would have to know if a name became obsolete or not. I would suspect
the overhead being greater than the benefits for this case.


Not sure why I'd remove a source ...


Space constraints, access to better sources, overlap, ...


This is scholarship, not a public lending library that only has room
for x number of books on its shelves.

Alternatively, if you would not share the names across sources, and
just use them within one source, it is highly unlikely that one name
will be used multiple times. There are exceptions such as the author
of a book section also being the editor of the book, but in most cases
the 'database' for a single source would actually be larger and
require more processing to obtain the same result.


So there are several options. And what if the
book were replaced with a film? Then you wouldn't have an author
at all; you would have performers, directors, writers, and producers.


Each one of which is a "name," and which, again, could be associated
with several of the above categories.


Adding a code in front of every element of every entry is pretty
much the same as putting each element between xml tags. So I guess
you could create the database yourself. You can always create a book
and a journal article through Word 2007, study the resulting
sources.xml (located at %appdata%\Microsoft\Bibliography), and then
copy/paste all other data into that xml file. That should be a bit
faster than copy/pasting everything into a source form for every
entry.


Yves



  #10   Report Post  
Posted to microsoft.public.word.docmanagement
p0 p0 is offline
external usenet poster
 
Posts: 254
Default text to bibliography?



grammatim schreef:
On Aug 16, 8:23 am, p0 wrote:
grammatim schreef:



On Aug 15, 7:28 pm, p0 wrote:
grammatim schreef:


On Aug 14, 6:34 pm, p0 wrote:
The software does not support what you want. And I think it is
doubtful it ever will.


An "author" of a book is not necessarily represented by a b:Author/
b:Author element in your XML.


You seem to be overlooking the bit where I said this could be done by
inserting a code, which would still be vastly preferable to retyping
the entire contents of a full subject bibliography.


I was talking about automatically processing lists without codes. The
second part of my reply stated that as soon as you start using codes,
you can just as well use xml tags and there would be no point in doing
anything automatically.


How is an "xml tag" not a code?


It is a code; the point is, once you add the xml tags, there is no
more need for processing. You have done the entire processing by hand.
All that is left to do is copy/pasting the sources into the database
and you are done.

How is typing (or pasting) a few-letter code not better than typing
entire names, titles, etc.?


Xml was intended to be human readable. The disadvantage is that tags
tend to be long; the major advantage is that you don't have to learn a
few dozen non-descriptive codes by heart (would @c be city, comments,
chapternumber, conferencename, country, court, or casenumber?). Also,
the use of closing tags is a good way of getting around punctuation
issues.

The following simple example:

John Doe. "A book by myself". London, 2008.

would result in the following string with xml 'codes':

<b:Source><b:SourceType>Book</b:SourceType>
<b:Author><b:Author><b:NameList><b:Person><b:First>John</b:First>
<b:Last>Doe</b:Last></b:Person></b:NameList></b:Author></b:Author>
<b:Garbage>. "</b:Garbage><b:Title>A book by myself</b:Title>
<b:Garbage>". </b:Garbage><b:City>London</b:City>
<b:Garbage>, </b:Garbage><b:Year>2008</b:Year><b:Garbage>.</b:Garbage></b:Source>


What a stupid system.


I didn't design the xml schema for bibliographies; you will have to
take that one up with Microsoft :-).

On a side note, the beauty of custom xml in ooxml is that you can
define your own way of storing data. And you don't even have to stick
to xml: you can store binary data in an xml file. So if you really are
unhappy with the format, you can easily extend Word with your own set
of bibliographic tools.

I would expect something like

<au>Doe, John</au><ti>A Book by Myself</ti><pl>London</pl><pu>Smith &
Wesson</pu><yr>2008</yr>

And </yr> isn't needed because it's always 4 digits, and the place and
publisher would use codes rather than spelling out: <pl>L<pu>SW. <au>
is known to always select an item from the Name list -- as are also
<ed>, <tr>, etc.


What would "ed" be? editor? edition? The entire point of using full
descriptive names in tags rather than crafty shortcuts is to make
things clear for the people who have to add them. Yes, you will have to
type more, but at least elements will be defined in such a way that
there is no confusion for the user. And for non-English-speaking
people, full words are a lot easier to understand than shady
abbreviations.

In your <au>, how would you see the difference between first, middle
and last names? And what if your author is a corporation? In that
case, it wouldn't be part of a namelist.

And a year is not always displayed with 4 digits; some styles require
you to print only 2. And what if a range of years is entered?
2008-2009 or 2008-09 or 08-09 ... And I haven't come across it, but I
wouldn't be surprised if some crazy citation style requires you to
enter the year in roman numerals: MMVIII. The point is, closing tags
are necessary to define boundaries. In your version you are already
conveniently leaving out punctuation.
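
To make the boundary argument concrete, here is a minimal Python sketch. The short tag names (au, ti, yr, pl) are the ones from the format proposed earlier in this thread; the function names and sample strings are made up for the illustration. With closing tags, a field can hold any content; a rule such as "a year is always 4 characters" silently cuts off ranges and roman numerals.

```python
import re

# Matches <tag>value</tag>; the closing tag marks where the value ends.
FIELD = re.compile(r"<(\w+)>(.*?)</\1>", re.DOTALL)


def parse_fields(entry):
    """Return (tag, value) pairs for every <tag>value</tag> in the entry."""
    return FIELD.findall(entry)


def year_fixed_width(entry):
    """Read the year as exactly 4 characters after <yr> (no closing tag)."""
    start = entry.index("<yr>") + len("<yr>")
    return entry[start:start + 4]


with_closing_tags = "<au>Doe, John</au><ti>A Book by Myself</ti><yr>2008-09</yr><pl>London</pl>"
fields = parse_fields(with_closing_tags)

# Without closing tags, the 4-character rule truncates the value.
range_year = year_fixed_width("<yr>2008-09<pl>London")
roman_year = year_fixed_width("<yr>MMVIII<pl>London")

print(fields)
print(range_year, roman_year)
```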

Why would L represent London? To me, it represents Leicester. Once
again, the small gain you can get with your code is hardly worth the
effort and confusion you introduce. BibTeX allows for the usage of
codes, so there you could define L for London, but it is never used
like that. The only usage I have seen of codes is for the abbreviation
of journal names (which is really a small bunch since they are grouped
by topic) and the localization of month names.

You can come up with dozens of shortcuts to store and process
bibliographic data but with every shortcut you introduce, you get rid
of functionality that others might need, and/or trade in usability for
(non-expert) users.

where b:Garbage elements are meaningless elements and should be
removed from the source at a later stage. Do you really think typing/
copying/pasting out all those 'codes' is easier than copy/pasting 5
elements into a form?


A book with a 20-word title can have six editors.

If the book would be an edited book,
then the author would be a b:Author/b:Editor element. If the book
would be a translated work, the author would be a b:Author/
b:Translator element.


If it were a proper relational database, then there would be a list of
"names," and in any particular instance, a name could be an "author,"
an "editor," a "translator," or even some combination of the above.


I am not a specialist when it comes to relational databases, but I
agree that the current layout is not in full normal form. However, the
way the names are stored seems to be similar to that of other programs
(EndNote - http://www.endnote.com/support/helpdocs/endnote.zip).


Personally, I also see no gain in going for a relational database in
full normal form where names (b:Person elements) are put in a separate
list. For starters, if you would share the same names across multiple
sources, you would have to create a two-way link: source to one or
more names, and name to one or more sources. The reason for the first
link is obvious: indicating which names participated in the source.
The second link is necessary in case you would remove a source. You
would have to know if a name became obsolete or not. I would suspect
the overhead being greater than the benefits for this case.


Not sure why I'd remove a source ...


Space constraints, access to better sources, overlap, ...


This is scholarship, not a public lending library that only has room
for x number of books on its shelves.


Conference papers are limited in number of pages. And if you have to
pick between reporting your data or having an extra reference, the
reference is normally the first to go.

Alternatively, if you would not share the names across sources, and
just use them within one source, it is highly unlikely that one name
will be used multiple times. There are exceptions such as the author
of a book section also being the editor of the book, but in most cases
the 'database' for a single source would actually be larger and
require more processing to obtain the same result.


So there are several options. And what if the
book were replaced with a film? Then you wouldn't have an author
at all; you would have performers, directors, writers, and producers.


Each one of which is a "name," and which, again, could be associated
with several of the above categories.


Adding a code in front of every element of every entry is pretty
much the same as putting each element between xml tags. So I guess
you could create the database yourself. You can always create a book
and a journal article through Word 2007, study the resulting
sources.xml (located at %appdata%\Microsoft\Bibliography), and then
copy/paste all other data into that xml file. That should be a bit
faster than copy/pasting everything into a source form for every
entry.


Yves



  #11   Report Post  
Posted to microsoft.public.word.docmanagement
grammatim[_2_] grammatim[_2_] is offline
external usenet poster
 
Posts: 2,751
Default text to bibliography?

On Aug 16, 7:33 pm, p0 wrote:
grammatim schreef:



On Aug 16, 8:23 am, p0 wrote:
grammatim schreef:


On Aug 15, 7:28 pm, p0 wrote:
grammatim schreef:


On Aug 14, 6:34 pm, p0 wrote:
The software does not support what you want. And I think it is
doubtful it ever will.


An "author" of a book is not necessarily represented by a b:Author/
b:Author element in your XML.


You seem to be overlooking the bit where I said this could be done by
inserting a code, which would still be vastly preferable to retyping
the entire contents of a full subject bibliography.


I was talking about automatically processing lists without codes. The
second part of my reply stated that as soon as you start using codes,
you can just as well use xml tags and there would be no point in doing
anything automatically.


How is an "xml tag" not a code?


It is a code; the point is, once you add the xml tags, there is no
more need for processing. You have done the entire processing by hand.
All that is left to do is copy/pasting the sources into the database
and you are done.


How is typing (or pasting) a few-letter code not better than typing
entire names, titles, etc.?


Xml was intended to be human readable. The disadvantage is that tags
tend to be long; the major advantage is that you don't have to learn a
few dozen non-descriptive codes by heart (would @c be city, comments,
chapternumber, conferencename, country, court, or casenumber?). Also,
the use of closing tags is a good way of getting around punctuation
issues.


The following simple example:


John Doe. "A book by myself". London, 2008.


would result in the following string with xml 'codes':


<b:Source><b:SourceType>Book</b:SourceType>
<b:Author><b:Author><b:NameList><b:Person><b:First>John</b:First>
<b:Last>Doe</b:Last></b:Person></b:NameList></b:Author></b:Author>
<b:Garbage>. "</b:Garbage><b:Title>A book by myself</b:Title>
<b:Garbage>". </b:Garbage><b:City>London</b:City>
<b:Garbage>, </b:Garbage><b:Year>2008</b:Year><b:Garbage>.</b:Garbage></b:Source>


What a stupid system.


I didn't design the xml schema for bibliographies; you will have to
take that one up with Microsoft :-).


That's why it wasn't rude to call it stupid!

On a side note, the beauty of custom xml in ooxml is that you can
define your own way of storing data. And you don't even have to stick
to xml: you can store binary data in an xml file. So if you really are
unhappy with the format, you can easily extend Word with your own set
of bibliographic tools.


I don't know what any of that means.

I would expect something like


<au>Doe, John</au><ti>A Book by Myself</ti><pl>London</pl><pu>Smith &
Wesson</pu><yr>2008</yr>


And </yr> isn't needed because it's always 4 digits, and the place and
publisher would use codes rather than spelling out: <pl>L<pu>SW. <au>
is known to always select an item from the Name list -- as are also
<ed>, <tr>, etc.


What would "ed" be? editor? edition?


ed vs. edn

The entire point of using full
descriptive names in tags rather than crafty shortcuts is to make
things clear for the people who have to add them.


But the people shouldn't ever need to see them! They should see a form
to fill in, with each slot labeled with the category that goes in it.
"Author" would have a drop-down list of all Names, since most subject
bibliographies involve several works by the same person. (Likewise for
"Place" and "Publisher.")

Yes, you will have to
type more, but at least elements will be defined in such a way that
there is no confusion for the user. And for non-English-speaking
people, full words are a lot easier to understand than shady
abbreviations.


Not at all a problem if you have an internationalizationized, or
whatever they call it, interface.

In your <au>, how would you see the difference between first, middle
and last names? And what if your author is a corporation? In that
case, it wouldn't be part of a namelist.


Corporations don't author scholarly works.

And a year is not always displayed with 4 digits; some styles require
you to print only 2. And what if a range of years is entered?
2008-2009 or 2008-09 or 08-09 ... And I haven't come across it, but I
wouldn't be surprised if some crazy citation style requires you to
enter the year in roman numerals: MMVIII. The point is, closing tags
are necessary to define boundaries. In your version you are already
conveniently leaving out punctuation.

Why would L represent London? To me, it represents Leicester.


How many publishers are headquartered in Leicester, wherever that is?

Once
again, the small gain you can get with your code is hardly worth the
effort and confusion you introduce. BibTeX allows for the usage of
codes. So there you could define L for London. But it is never used
like that. The only usage I have seen of codes is for the abbreviation
of journal names (which is really a small bunch since they are grouped
on topic) and the localization of month names.

You can come up with dozens of shortcuts to store and process
bibliographic data but with every shortcut you introduce, you get rid
of functionality that others might need, and/or trade in usability for
(non-expert) users.


Have a look at the, alas, defunct Mac program Papyrus (it wasn't worth
the effort for the creator to adapt it for OS X, so he just offers it
as freeware to anyone with a "legacy system," but its discussion list
was still active back when I had to abandon the Mac, two+ years ago).

where b:Garbage elements are meaningless elements and should be
removed from the source at a later stage. Do you really think typing/
copying/pasting out all those 'codes' is easier than copy/pasting 5
elements into a form?


A book with a 20-word title can have six editors.


If the book would be an edited book,
then the author would be a b:Author/b:Editor element. If the book
would be a translated work, the author would be a b:Author/
b:Translator element.


If it were a proper relational database, then there would be a list of
"names," and in any particular instance, a name could be an "author,"
an "editor," a "translator," or even some combination of the above.


I am not a specialist when it comes to relational databases, but I
agree that the current layout is not in full normal form. However, the
way the names are stored seems to be similar to that of other programs
(EndNote - http://www.endnote.com/support/helpdocs/endnote.zip).


Personally, I also see no gain in going for a relational database in
full normal form where names (b:Person elements) are put in a separate
list. For starters, if you would share the same names across multiple
sources, you would have to create a two-way link: source to one or
more names, and name to one or more sources. The reason for the first
link is obvious: indicating which names participated in the source.
The second link is necessary in case you would remove a source. You
would have to know if a name became obsolete or not. I would suspect
the overhead being greater than the benefits for this case.


Not sure why I'd remove a source ...


Space constraints, access to better sources, overlap, ...


This is scholarship, not a public lending library that only has room
for x number of books on its shelves.


Conference papers are limited in number of pages. And if you have to
pick between reporting your data or having an extra reference, the
reference is normally the first to go.


We're talking about a bibliographic database, not a list of
references.



[I have to Send before I can see if you've added anything below here]
  #12   Report Post  
Posted to microsoft.public.word.docmanagement
p0 p0 is offline
external usenet poster
 
Posts: 254
Default text to bibliography?

I'm stripping parts from the original message as it has become too
large to process decently.

On a side note, the beauty of custom xml in ooxml is that you can
define your own way of storing data. And you don't even have to stick
to xml: you can store binary data in an xml file. So if you really are
unhappy with the format, you can easily extend Word with your own set
of bibliographic tools.


I don't know what any of that means.


Well, if you are concerned with size (little tags rather than big
ones), you are in for a surprise: your Word document actually contains
all bibliographic data twice (talk about overkill).

What you see as a docx file is nothing more than a zip-file. So if you
change the extension from docx to zip (make sure you have a backup),
you can use the compressed folders utility from Windows or an external
program such as WinRAR or WinZip to extract the contents of your
document. In it, you will normally find a file item1.xml in the
customXml directory. That is actually an xml notation of all the
bibliographic data in your document. You will also find a document.xml
file in the word directory. That file contains your entire text
including your well-formatted bibliography (no longer in xml format).
It is a nice separation between the data and the view on the data.

So what I meant was, if you aren't happy with the current internal
data layout, you can very well define your own layout and then format
the data in the document.xml according to your layout (stored in your
version of item1.xml) and preferences.
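
For anyone who would rather not rename and unzip by hand, the same look inside the package can be sketched in Python with the standard zipfile module. This is only an illustration: the function name is made up, it assumes the custom XML parts are UTF-8 encoded, and it recognises the bibliography part by searching for the bibliography namespace rather than relying on the name item1.xml. The demo builds a tiny stand-in package in memory instead of opening a real docx.

```python
import io
import re
import zipfile

BIB_NS = "http://schemas.openxmlformats.org/officeDocument/2006/bibliography"


def find_bibliography_parts(docx):
    """Return {part name: XML text} for customXml/itemN.xml parts that use BIB_NS.

    `docx` may be a path or a file-like object. Assumes UTF-8 encoded parts.
    """
    found = {}
    with zipfile.ZipFile(docx) as package:
        for name in package.namelist():
            if re.fullmatch(r"customXml/item\d+\.xml", name):
                text = package.read(name).decode("utf-8")
                if BIB_NS in text:
                    found[name] = text
    return found


# Stand-in for a real document: a zip with the two parts mentioned above.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as package:
    package.writestr("word/document.xml", "<document/>")
    package.writestr(
        "customXml/item1.xml",
        f'<b:Sources xmlns:b="{BIB_NS}"></b:Sources>',
    )

parts = find_bibliography_parts(buffer)
print(list(parts))
```

With a real file you would pass its path, for example find_bibliography_parts("paper.docx"), where the file name is of course just a placeholder.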

What would "ed" be? editor? edition?


ed vs. edn


And then I would think about "editorial notes". Really, shortcutting
data entries to save space is, in my personal opinion, about the worst
thing you can do. EndNote allows importing data based on shortcut
codes. But once imported, the data is once again stored in
'understandable' xml, as it should be. And luckily so,
because nobody without a decent manual would be able to figure out
that %I is actually the field representing the publisher.

The entire point of using full
descriptive names in tags rather than crafty shortcuts is to make
things clear for the people who have to add them.


But the people shouldn't ever need to see them! They should see a form
to fill in, with each slot labeled with the category that goes in it.
"Author" would have a drop-down list of all Names, since most subject
bibliographies involve several works by the same person. (Likewise for
"Place" and "Publisher.")

Yes, you will have to
type more, but at least elements will be defined in such a way that
there is no confusion for the user. And for non-English-speaking
people, full words are a lot easier to understand than shady
abbreviations.


Not at all a problem if you have an internationalizationized, or
whatever they call it, interface.


That's what the source form (insert new citation) is for in Word 2007.
Check your computer for a bibform.xml. If you are using an en-US
version of Word, it should be in
<Word directory>\1033\bibliography\bibform.xml. For other languages,
you will have to replace 1033 with your locale ID (LCID). The file
contains a mapping of localized strings (Label element) to xml tags
(DataTag element). On a side note, bibform.xml claims to follow the
bibliography xml schema (default namespace), but it does not, since
the schema does not define anything about the mapping.

Have a look at the, alas, defunct Mac program Papyrus (it wasn't worth
the effort for the creator to adapt it for OS X, so he just offers it
as freeware to anyone with a "legacy system," but its discussion list
was still active back when I had to abandon the Mac, two+ years ago).


The setup of this tool is totally different, this is a tool for
storing and searching bibliographic information, even entire
libraries. As a side product, it also allows you to format the output
a bit. Microsoft's tool is intended only for providing formatted
output. They don't care about maintaining a library where you can find
stuff by keywords or authors or ...


But all this is beside the point; the original topic was about adding
textual sources to your document in an automated way. I have seen some
tools for converting BibTeX or EndNote files into Word 2007 sources.
And you can always create a converter which translates your home-made
format into Microsoft's format, but you can't expect Microsoft to
support your format by default. They have a format, and you either
stick to it, or you design something else (which is pretty easy using
custom xml). The choice is up to you.
  #13
Posted to microsoft.public.word.docmanagement
grammatim[_2_]
text to bibliography?

On Aug 17, 8:49 am, p0 wrote:
I'm stripping parts from the original message as it has become too
large to process decently.


Quite!

On a side note, the beauty of custom xml in ooxml is that you can
define your own way of storing data. And you don't even have to stick
to xml: you can store binary data in an xml file. So if you really are
unhappy with the format, you can easily extend Word with your own set
of bibliographic tools.


I don't know what any of that means.


Well, if you are concerned with size (little tags rather than big
ones), you are in for a surprise, your Word document actually contains
all bibliographic data twice (talking about overkill).

What you see as a docx file is nothing more than a zip-file. So if you
change the extension from docx to zip (make sure you have a backup),
you can use the compressed folders utility from Windows or an external
program such as WinRAR or WinZip to extract the contents of your
document. In it, you will normally find a file item1.xml in the
customXml directory. That is actually an xml notation of all the
bibliographic data in your source. You will also find a document.xml
file in the word directory. That file contains your entire text
including your well-formatted bibliography (no longer in xml format).
It is a nice separation between the data and the view on the data.

So what I meant was, if you aren't happy with the current internal
data layout, you can very well define your own layout and then format
the data in the document.xml according to your layout (stored in your
version of item1.xml) and preferences.


I've no idea what the current internal data layout may be, nor should
I. As an end user, I expect the product to work as it should.

What would "ed" be? editor? edition?


ed vs. edn


And then I would think about "editorial notes".


Sorry, but "editorial notes" is not a category that appears in a
bibliography. It looks as though you are looking for details to
complain about, rather than understanding the user's needs.

Really, shortcutting
data entries to save space is, in my personal opinion, about the worst
thing you can do.


Not sure what "data entries" are, but if you're referring to entering
data, you're wrong.

EndNote allows importing data based on shortcut
codes. But once imported, the data is once again stored in
'understandable' xml as it should be done. And luckily so,
because nobody without a decent manual would be able to figure out
that %I is actually the field representing the publisher.


Why would anyone ever need to "figure out" such a thing?

The entire point of using full
descriptive names in tags rather than crafty shortcuts is to make
things clear for the people who have to add them.


But the people shouldn't ever need to see them! They should see a form
to fill in, with each slot labeled with the category that goes in it.
"Author" would have a drop-down list of all Names, since most subject
bibliographies involve several works by the same person. (Likewise for
"Place" and "Publisher.")


Yes you will have to
type more, but at least elements will be defined in such a way that
there is no confusion for the user. And for non-English speaking
people, full words are a lot easier to understand than shady
abbreviations.


Not at all a problem if you have an internationalizationized, or
whatever they call it, interface.


That's what the source form (insert new citation) is for in Word 2007.
Check your computer for a bibform.xml. If you are using an en-US
version of Word, it should be in
<Word directory>\1033\bibliography\bibform.xml. For other languages,
you will have to replace 1033 with your locale ID (LCID). The file
contains a mapping of localized strings (Label element) to xml tags
(DataTag element). On a side note, bibform.xml claims to follow the
bibliography xml schema (default namespace), but it does not, since
the schema does not define anything about the mapping.


Yes, I'll be sure to do all that as soon as I have my new system.
(Which didn't happen yesterday, without even a phone call to move it
to today.)

Have a look at the, alas, defunct Mac program Papyrus (it wasn't worth
the effort for the creator to adapt it for OS X, so he just offers it
as freeware to anyone with a "legacy system," but its discussion list
was still active back when I had to abandon the Mac, two+ years ago).


The setup of this tool is totally different, this is a tool for
storing and searching bibliographic information, even entire
libraries. As a side product, it also allows you to format the output
a bit. Microsoft's tool is intended only for providing formatted
output. They don't care about maintaining a library where you can find
stuff by keywords or authors or ...

But all this is beside the point; the original topic was about adding
textual sources to your document in an automated way. I have seen some
tools for converting BibTeX or EndNote files into Word 2007 sources.
And you can always create a converter which translates your home-made
format into Microsoft's format, but you can't expect Microsoft to
support your format by default. They have a format, and you either
stick to it, or you design something else (which is pretty easy using
custom xml). The choice is up to you.


I am not talking about "formats." I am talking about plain text, plain
text that looks exactly the way published bibliographies have looked
for about a century now.

It doesn't seem too much to ask that "Text to Table" could come up
with a tabular presentation, which some other module could then
convert to the "format" used by the bibliographic database: if it
knows that col. 1 is the author, col. 2 is the date, col. 3 is the
title, col. 4 is the place, and col. 5 is the publisher (that's a
basic Book entry), why can't it simply do that?
  #14
Posted to microsoft.public.word.docmanagement
p0
text to bibliography?

On 17 aug, 15:14, grammatim wrote:
On Aug 17, 8:49 am, p0 wrote:

Well, if you are concerned with size (little tags rather than big
ones), you are in for a surprise, your Word document actually contains
all bibliographic data twice (talking about overkill).


What you see as a docx file is nothing more than a zip-file. So if you
change the extension from docx to zip (make sure you have a backup),
you can use the compressed folders utility from Windows or an external
program such as WinRAR or WinZip to extract the contents of your
document. In it, you will normally find a file item1.xml in the
customXml directory. That is actually an xml notation of all the
bibliographic data in your source. You will also find a document.xml
file in the word directory. That file contains your entire text
including your well-formatted bibliography (no longer in xml format).
It is a nice separation between the data and the view on the data.


So what I meant was, if you aren't happy with the current internal
data layout, you can very well define your own layout and then format
the data in the document.xml according to your layout (stored in your
version of item1.xml) and preferences.


I've no idea what the current internal data layout may be, nor should
I. As an end user, I expect the product to work as it should.


It is true, you shouldn't know and you don't have to. All you have to
do is fill in the form that is presented when you want to enter a
citation. As soon as you want more than that, like having Word
understand your way of formatting (be it tables, binary structures,
static text, ...), then it is up to you to learn the underlying format
and convert your data structures to it. As an alternative, you can of
course extend the underlying format (part 5 of the Office Open XML
specification).

What would "ed" be? editor? edition?


ed vs. edn


And then I would think about "editorial notes".


Sorry, but "editorial notes" is not a category that appears in a
bibliography. It looks as though you are looking for details to
complain about, rather than understanding the user's needs.


It is in annotated bibliographies (something Word 2007 does not
support by the way).


Really, shortcutting
data entries to save space is, in my personal opinion, about the worst
thing you can do.


Not sure what "data entries" are, but if you're referring to entering
data, you're wrong.

EndNote allows importing data based on shortcut
codes. But once imported, the data is once again stored in
'understandable' xml as it should be done. And luckily so,
because nobody without a decent manual would be able to figure out
that %I is actually the field representing the publisher.


Why would anyone ever need to "figure out" such a thing?



Well clearly you would, since you would add the code to your current
static text to convert it into a bibliographic source.




The entire point of using full
descriptive names in tags rather than crafty shortcuts is to make
things clear for the people who have to add them.


But the people shouldn't ever need to see them! They should see a form
to fill in, with each slot labeled with the category that goes in it.
"Author" would have a drop-down list of all Names, since most subject
bibliographies involve several works by the same person. (Likewise for
"Place" and "Publisher.")


Yes you will have to
type more, but at least elements will be defined in such a way that
there is no confusion for the user. And for non-English speaking
people, full words are a lot easier to understand than shady
abbreviations.


Not at all a problem if you have an internationalizationized, or
whatever they call it, interface.


That's what the source form (insert new citation) is for in Word 2007.
Check your computer for a bibform.xml. If you are using an en-US
version of Word, it should be in
<Word directory>\1033\bibliography\bibform.xml. For other languages,
you will have to replace 1033 with your locale ID (LCID). The file
contains a mapping of localized strings (Label element) to xml tags
(DataTag element). On a side note, bibform.xml claims to follow the
bibliography xml schema (default namespace), but it does not, since
the schema does not define anything about the mapping.


Yes, I'll be sure to do all that as soon as I have my new system.
(Which didn't happen yesterday, without even a phone call to move it
to today.)





Have a look at the, alas, defunct Mac program Papyrus (it wasn't worth
the effort for the creator to adapt it for OS X, so he just offers it
as freeware to anyone with a "legacy system," but its discussion list
was still active back when I had to abandon the Mac, two+ years ago).


The setup of this tool is totally different, this is a tool for
storing and searching bibliographic information, even entire
libraries. As a side product, it also allows you to format the output
a bit. Microsoft's tool is intended only for providing formatted
output. They don't care about maintaining a library where you can find
stuff by keywords or authors or ...


But all this is beside the point; the original topic was about adding
textual sources to your document in an automated way. I have seen some
tools for converting BibTeX or EndNote files into Word 2007 sources.
And you can always create a converter which translates your home-made
format into Microsoft's format, but you can't expect Microsoft to
support your format by default. They have a format, and you either
stick to it, or you design something else (which is pretty easy using
custom xml). The choice is up to you.


I am not talking about "formats." I am talking about plain text, plain
text that looks exactly the way published bibliographies have looked
for about a century now.


And how do they look? Currently, my EndNote X1 style directory comes
with close to 3000 styles (2932 actually, but I have not downloaded
all available styles from their site). So this means, I currently have
3000 plain text versions of published bibliographies for a single
source. Are you going to write a converter which figures out which one
of those 3000 is used? Because you will have to before even starting
to parse the static text within one entry into a source.

Even within the same scientific journal, bibliographies tend to be
formatted differently.

It doesn't seem too much to ask that "Text to Table" could come up
with a tabular presentation, which some other module could then
convert to the "format" used by the bibliographic database: if it
knows that col. 1 is the author, col. 2 is the date, col. 3 is the
title, col. 4 is the place, and col. 5 is the publisher (that's a
basic Book entry), why can't it simply do that?


Now you are no longer talking static text, you are talking (poorly)
formatted text. And you would have to have a tool to map columns to
fields, since in my case, year should be the last entry (except maybe
for pages) in my bibliography and most certainly not the second.

And in your case, how is your book displayed if it is an anonymous
work? I would guess col. 1 is the title, col. 2 is the date, col. 3 is
the place, and col.4 is the publisher. So even between 2 entries of
the same type, the ordering of data would be different.
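
A rough Python sketch of the column-to-field mapping tool I mean, assuming tab-delimited rows. The layout names and the function are made up for illustration. Each style needs its own column order, and a row is only accepted if it has exactly the number of cells the layout expects.

```python
# Hypothetical layouts: the same five fields in two different column orders.
LAYOUTS = {
    "author-date": ["Author", "Year", "Title", "City", "Publisher"],
    "year-last": ["Author", "Title", "City", "Publisher", "Year"],
}


def map_columns(row, layout):
    """Map the tab-delimited cells of `row` to field names for `layout`."""
    names = LAYOUTS[layout]
    cells = row.rstrip("\n").split("\t")
    if len(cells) != len(names):
        raise ValueError(
            f"expected {len(names)} columns for {layout!r}, got {len(cells)}"
        )
    # Empty cells (for example no author on an anonymous work) are dropped.
    return {name: cell for name, cell in zip(names, cells) if cell}


print(map_columns("Doe, Jane\t2007\tAn Example\tChicago\tExample Press", "author-date"))
print(map_columns("\tAn Example\tChicago\tExample Press\t2007", "year-last"))
```

The second call only works because the empty author cell is still delimited. Drop that leading tab and the row no longer matches any layout.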

Maybe you don't have anonymous works, but it doesn't matter. What you
require is so specific that you will probably be the only one using the
'import filter' anyway. The point is, Microsoft provides a set of
generic tools which works for 80% of their customers. There is no
point for them in developing a tool which will work in a very specific
case (yours) and therefore will target 1% or less of their customer
base. If you want one, you will have to write it yourself. They
provide a specification of the bibliography format and even provide a
programming interface (I have no experience with it). They try to help
you a long way, but the last few steps you will have to take yourself.

And before you start thinking that I am a Microsoft evangelist, I am
most definitely not. I can point out at least half a dozen flaws with
the current bibliographic tools, ranging from simple bugs to major
design issues. But those are not the point of this thread :-)
  #15
Posted to microsoft.public.word.docmanagement
grammatim[_2_]
text to bibliography?

On Aug 17, 10:01 am, p0 wrote:
On 17 aug, 15:14, grammatim wrote:
On Aug 17, 8:49 am, p0 wrote:


EndNote allows importing data based on shortcut
codes. But once imported, the data is once again stored in
'understandable' xml as it should be done. And luckily so,
because nobody without a decent manual would be able to figure out
that %I is actually the field representing the publisher.


Why would anyone ever need to "figure out" such a thing?


Well clearly you would, since you would add the code to your current
static text to convert it into a bibliographic source.


But why would you need to "figure it out"? (Oh, that's right, apps
don't come with instructions any more -- developers think they're
usable out of the box with no preparation.)

The entire point of using full
descriptive names in tags rather than crafty shortcuts is to make
things clear for the people who have to add them.


But the people shouldn't ever need to see them! They should see a form
to fill in, with each slot labeled with the category that goes in it.
"Author" would have a drop-down list of all Names, since most subject
bibliographies involve several works by the same person. (Likewise for
"Place" and "Publisher.")


Yes you will have to
type more, but at least elements will be defined in such a way that
there is no confusion for the user. And for non-English speaking
people, full words are a lot easier to understand than shady
abbreviations.


Not at all a problem if you have an internationalizationized, or
whatever they call it, interface.


That's what the source form (insert new citation) is for in Word 2007.
Check your computer for a bibform.xml. If you are using an en-US
version of Word, it should be in
<Word directory>\1033\bibliography\bibform.xml. For other languages,
you will have to replace 1033 with your locale ID (LCID). The file
contains a mapping of localized strings (Label element) to xml tags
(DataTag element). On a side note, bibform.xml claims to follow the
bibliography xml schema (default namespace), but it does not, since
the schema does not define anything about the mapping.


Yes, I'll be sure to do all that as soon as I have my new system.
(Which didn't happen yesterday, without even a phone call to move it
to today.)


Now it's going to be tomorrow afternoon ...

Have a look at the, alas, defunct Mac program Papyrus (it wasn't worth
the effort for the creator to adapt it for OS X, so he just offers it
as freeware to anyone with a "legacy system," but its discussion list
was still active back when I had to abandon the Mac, two+ years ago).


The setup of this tool is totally different, this is a tool for
storing and searching bibliographic information, even entire
libraries. As a side product, it also allows you to format the output
a bit. Microsoft's tool is intended only for providing formatted
output. They don't care about maintaining a library where you can find
stuff by keywords or authors or ...


But all this is beside the point; the original topic was about adding
textual sources to your document in an automated way. I have seen some
tools for converting BibTeX or EndNote files into Word 2007 sources.
And you can always create a converter which translates your home-made
format into Microsoft's format, but you can't expect Microsoft to
support your format by default. They have a format, and you either
stick to it, or you design something else (which is pretty easy using
custom xml). The choice is up to you.


I am not talking about "formats." I am talking about plain text, plain
text that looks exactly the way published bibliographies have looked
for about a century now.


And how do they look?


They look like what the Chicago Manual of Style says they should look
like, or a reasonable approximation thereto.

Currently, my EndNote X1 style directory comes
with close to 3000 styles (2932 actually, but I have not downloaded
all available styles from their site). So this means, I currently have
3000 plain text versions of published bibliographies for a single
source. Are you going to write a converter which figures out which one
of those 3000 is used? Because you will have to before even starting
to parse the static text within one entry into a source.


Sounds like Microsoft-type overkill. Is that why the price is so
prohibitively high?

It ought to come with a dozen or so standard output styles, and the
ability to combine the building blocks plus punctuation into any
additional styles one might encounter.

Even within the same scientific journal, bibliographies tend to be
formatted differently.


Not if the copyeditors are doing their job. (I was a Manuscript Editor
at Astrophysical Journal for two years.)

It doesn't seem too much to ask that "Text to Table" could come up
with a tabular presentation, which some other module could then
convert to the "format" used by the bibliographic database: if it
knows that col. 1 is the author, col. 2 is the date, col. 3 is the
title, col. 4 is the place, and col. 5 is the publisher (that's a
basic Book entry), why can't it simply do that?


Now you are no longer talking static text, you are talking (poorly)


"Poorly"? CMS has been around since 1906 and is by far the leading
style guide in the US.

formatted text. And you would have to have a tool to map columns to
fields, since in my case, year should be the last entry (except maybe
for pages) in my bibliography and most certainly not the second.


So you don't do author-date references in the text? Fine. That would
be Chicago's "Humanities" style.

And in your case, how is your book displayed if it is an anonymous
work? I would guess col. 1 is the title, col. 2 is the date, col. 3 is
the place, and col.4 is the publisher. So even between 2 entries of
the same type, the ordering of data would be different.


No, col. 1 would be empty. (Though there are circumstances in which
the author is entered as "Anonymous"; see CMS.)

Maybe you don't have anonymous works, but it doesn't matter. What you
require is so specific that you will probably be the only one using the
'import filter' anyway. The point is, Microsoft provides a set of
generic tools which works for 80% of their customers. There is no


You are again losing sight of the point. They provide _no_ tool for
going from an existing bibliography to the bibliography database.

point for them in developing a tool which will work in a very specific
case (yours) and therefore will target 1% or less of their customer
base. If you want one, you will have to write it yourself. They
provide a specification of the bibliography format and even provide a
programming interface (I have no experience with it). They try to help
you a long way, but the last few steps you will have to take yourself.


On the occasions when programming new reference styles has been
mentioned here, the MVPs have stated it appears to be impossibly
complicated to do so.

I will soon find out how greatly it respects CMS style, especially for
complicated entries.

And before you start thinking that I am a Microsoft evangelist, I am
most definitely not. I can point out at least half a dozen flaws with
the current bibliographic tools, ranging from simple bugs to major
design issues. But those are not the point of this thread :-)



  #16
Posted to microsoft.public.word.docmanagement
p0
text to bibliography?


Why would anyone ever need to "figure out" such a thing?


Well clearly you would, since you would add the code to your current
static text to convert it into a bibliographic source.


But why would you need to "figure it out"? (Oh, that's right, apps
don't come with instructions any more -- developers think they're
usable out of the box with no preparation.)


Your Papyrus example comes with over 500 pages of manual. Good luck
convincing any user to read even 25 pages before he can start using
a program, let alone 500. People just aren't patient enough anymore
to look things up in a help file (see also my comment below on
creating new formatting styles).

That aside, the bibliographic tools of Word 2007 lack almost all
documentation; the promised SDK is almost a year overdue now (I doubt
it will ever be released); and none of the people originally working on
the academic features seem to be doing that job anymore.


Have a look at the, alas, defunct Mac program Papyrus (it wasn't worth
the effort for the creator to adapt it for OS X, so he just offers it
as freeware to anyone with a "legacy system," but its discussion list
was still active back when I had to abandon the Mac, two+ years ago).


The setup of this tool is totally different, this is a tool for
storing and searching bibliographic information, even entire
libraries. As a side product, it also allows you to format the output
a bit. Microsoft's tool is intended only for providing formatted
output. They don't care about maintaining a library where you can find
stuff by keywords or authors or ...


But all this is beside the point; the original topic was about adding
textual sources to your document in an automated way. I have seen some
tools for converting BibTeX or EndNote files into Word 2007 sources.
And you can always create a converter which translates your home-made
format into Microsoft's format, but you can't expect Microsoft to
support your format by default. They have a format, and you either
stick to it, or you design something else (which is pretty easy using
custom xml). The choice is up to you.


I am not talking about "formats." I am talking about plain text, plain
text that looks exactly the way published bibliographies have looked
for about a century now.


And how do they look?


They look like what the Chicago Manual of Style says they should look
like, or a reasonable approximation thereto.


I would hate to be the programmer who gets "a reasonable
approximation" of the specified input and has to write a program which
takes that approximation and translates it into the desired output.

Currently, my EndNote X1 style directory comes
with close to 3000 styles (2932 actually, but I have not downloaded
all available styles from their site). So this means, I currently have
3000 plain text versions of published bibliographies for a single
source. Are you going to write a converter which figures out which one
of those 3000 is used? Because you will have to before even starting
to parse the static text within one entry into a source.


Sounds like Microsoft-type overkill. Is that why the price is so
prohibitively high?


No idea about the price. But you can not blame EndNote for journals
and magazines not sticking to one worldwide standardized way to
display bibliographic data.

It ought to come with a dozen or so standard output styles, and the
ability to combine the building blocks plus punctuation into any
additional styles one might encounter.

Even within the same scientific journal, bibliographies tend to be
formatted differently.


Not if the copyeditors are doing their job. (I was a Manuscript Editor
at Astrophysical Journal for two years.)

It doesn't seem too much to ask that "Text to Table" could come up
with a tabular presentation, which some other module could then
convert to the "format" used by the bibliographic database: if it
knows that col. 1 is the author, col. 2 is the date, col. 3 is the
title, col. 4 is the place, and col. 5 is the publisher (that's a
basic Book entry), why can't it simply do that?


Now you are no longer talking static text, you are talking (poorly)


"Poorly"? CMS has been around since 1906 and is by far the leading
style guide in the US.


If I asked the medical doctors what the leading US style guide is,
they would say AMA.
If I asked the psychologists what the leading US style is, they would
say APA.
If I asked the legal people what the leading US style is, they would
say Bluebook.
If I asked the average school kid writing a little science paper what
the leading US style is, he or she would say Turabian.
And if you asked the rest of the world what the most commonly used
style is, they would probably say Harvard (which is also the oldest
one if I'm not mistaken).

I am not trying to say that CMS isn't important or widely used, it is
just that everybody feels that his or her style is the most important
and commonly used one while it isn't.

On a side note, of the above list, only APA and Turabian are supported
in Word 2007.

formatted text. And you would have to have a tool to map columns to
fields, since in my case, year should be the last entry (except maybe
for pages) in my bibliography and most certainly not the second.


So you don't do author-date references in the text? Fine. That would
be Chicago's "Humanities" style.


The style I use is not supported by Word 2007 at all. I did write the
transformation stylesheet for it from scratch.

And in your case, how is your book displayed if it is an anonymous
work? I would guess col. 1 is the title, col. 2 is the date, col. 3 is
the place, and col. 4 is the publisher. So even between 2 entries of
the same type, the ordering of data would be different.


No, col. 1 would be empty. (Though there are circumstances in which
the author is entered as "Anonymous"; see CMS.)


And how do you expect your static text parser to guess that column one
is empty? Once you start adding delimiters, you can just as well use
the delimiters Microsoft defined. Those delimiters being XML tags. They
might not be what you prefer as delimiters, but they are delimiters.

Maybe you don't have anonymous works, but it doesn't matter. What you
require is so specific that you will probably be the only one using the
'import filter' anyway. The point is, Microsoft provides a set of
generic tools which works for 80% of their customers. There is no


You are again losing sight of the point. They provide _no_ tool for
going from an existing bibliography to the bibliography database.


The tools are there, they are just not obvious in use for the average
Word user.

The format of a b:Source element is entirely defined by an xml schema.
All you have to do is write a (simple) XSLT which transforms your
format into the format described by that schema. Of course, if your
format happens to be an incomprehensible static text, your XSLT will
be very complicated. But you can not blame Microsoft for that.
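
To give an idea of the target format (this is not anything shipped with Word), here is a minimal Python sketch that builds a b:Sources document holding one book. The namespace is the one quoted at the top of the thread. The child element names (Tag, SourceType, Author/NameList/Person, Title, Year, City, Publisher) are my reading of the bibliography schema and should be checked against the specification; the book itself is invented.

```python
import xml.etree.ElementTree as ET

# Bibliography namespace as quoted earlier in the thread.
B_NS = "http://schemas.openxmlformats.org/officeDocument/2006/bibliography"
ET.register_namespace("b", B_NS)


def b(name):
    """Qualify an element name with the bibliography namespace."""
    return "{%s}%s" % (B_NS, name)


def add_text(parent, name, text):
    """Append a child element carrying plain text."""
    child = ET.SubElement(parent, b(name))
    child.text = text
    return child


def make_book(tag, last, first, title, year, city, publisher):
    """Build one b:Source for a book; child names are assumptions."""
    source = ET.Element(b("Source"))
    add_text(source, "Tag", tag)
    add_text(source, "SourceType", "Book")
    if last:  # an anonymous work simply gets no author block
        outer = ET.SubElement(source, b("Author"))
        inner = ET.SubElement(outer, b("Author"))
        names = ET.SubElement(inner, b("NameList"))
        person = ET.SubElement(names, b("Person"))
        add_text(person, "Last", last)
        add_text(person, "First", first)
    add_text(source, "Title", title)
    add_text(source, "Year", year)
    add_text(source, "City", city)
    add_text(source, "Publisher", publisher)
    return source


# Invented sample entry, for illustration only.
sources = ET.Element(b("Sources"))
sources.append(make_book("Doe06", "Doe", "Jane", "An Example Book",
                         "1906", "Chicago", "Example Press"))
xml_text = ET.tostring(sources, encoding="unicode")
print(xml_text)
```

Whatever produces this XML (an XSLT, a script, or typing it by hand) is free to choose; the only fixed part is the shape of the output.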

point for them in developing a tool which will work in a very specific
case (yours) and therefore will target 1% or less of their customer
base. If you want one, you will have to write it yourself. They
provide a specification of the bibliography format and even provide a
programming interface (I have no experience with it). They try to help
you a long way, but the last few steps you will have to take yourself.


On the occasions when programming new reference styles has been
mentioned here, the MVPs have stated it appears to be impossibly
complicated to do so.


It is not. It is pretty basic XSLT, nothing fancy about it.

Like you said above, all people have to do is read the available
help:
* you have the open xml specification;
* you have blog articles by Microsoft people;
* you have MSDN articles describing the format (not extensively
though);
* you have 10 predefined styles, each consisting of a couple of
thousand lines of XSLT code (that is over 10,000 lines of example code)

So there is plenty of information around. Maybe it is not perfectly
organized, but it is there if you want to learn how to use it.

But the MVPs are correct, as long as there is no point and click
solution, it is too complicated for the average Word user. And no
matter how many help files you are going to add, it will remain too
complicated.

I will soon find out how greatly it respects CMS style, especially for
complicated entries.


Well I never use the style, but since Word only defines one version,
and the Chicago style is different for different research fields, I
would not get my hopes up if I were you.
  #17
Posted to microsoft.public.word.docmanagement
grammatim[_2_]
text to bibliography?

On Aug 17, 6:47 pm, p0 wrote:
Why would anyone ever need to "figure out" such a thing?


Well clearly you would, since you would add the code to your current
static text to convert it into a bibliographic source.


But why would you need to "figure it out"? (Oh, that's right, apps
don't come with instructions any more -- developers think they're
usable out of the box with no preparation.)


Your Papyrus example comes with over 500 pages of manual. Good luck in


I did not know that! I guess I found it highly intuitive.

convincing any user of reading even 25 pages before he can start using
a program, let alone 500. People just aren't patient enough anymore
for looking up things in a help file (see also my comment below on
creating new formatting styles).


No, that's not it at all. You cannot use a "Help" file unless you
happen to know the exact name that the writers of the "Help" have
assigned to a feature/bug. Look how many times a day it is asked here
how to get rid of the dots between words, or why there's suddenly no
vertical space between the pages.

That aside, the bibliographic tools of Word 2007 lack almost all
documentation; the promised SDK is almost a year overdue now (I doubt
it will ever be released); and none of the people originally working on
the academic features seem to be still doing that job nowadays.


What's an SDK? You use their jargon, you must be one of them!

Have a look at the, alas, defunct Mac program Papyrus (it wasn't worth
the effort for the creator to adapt it for OS X, so he just offers it
as freeware to anyone with a "legacy system," but its discussion list
was still active back when I had to abandon the Mac, two+ years ago).


The setup of this tool is totally different, this is a tool for
storing and searching bibliographic information, even entire
libraries. As a side product, it also allows you to format the output
a bit. Microsoft's tool is intended only for providing formatted
output. They don't care about maintaining a library where you can find
stuff by keywords or authors or ...


But all this is beside the point; the original topic was about adding
textual sources to your document in an automated way. I have seen some
tools for converting BibTeX or EndNote files into Word 2007 sources.
And you can always create a converter which translates your home-made
format into Microsoft's format, but you can't expect Microsoft to
support your format by default. They have a format, and you either
stick to it, or you design something else (which is pretty easy using
custom xml). The choice is up to you.


I am not talking about "formats." I am talking about plain text, plain
text that looks exactly the way published bibliographies have looked
for about a century now.


And how do they look?


They look like what the Chicago Manual of Style says they should look
like, or a reasonable approximation thereto.


I would hate to be the programmer who gets "a reasonable
approximation" of the specified input and has to write a program which
takes that approximation and translates it into the desired output.


I gather from the comments here that the "Chicago" setting doesn't
exactly mimic, or duplicate, the specifications of the CMS.

Currently, my EndNote X1 style directory comes
with close to 3000 styles (2932 actually, but I have not downloaded
all available styles from their site). So this means, I currently have
3000 plain text versions of published bibliographies for a single
source. Are you going to write a converter which figures out which one
of those 3000 is used? Because you will have to before even starting
to parse the static text within one entry into a source.


Sounds like Microsoft-type overkill. Is that why the price is so
prohibitively high?


No idea about the price. But you can not blame EndNote for journals
and magazines not sticking to one worldwide standardized way to
display bibliographic data.


There was no need for "one worldwide standardized way" before there
were online bibliographies. Electronic catalogs were developed in the
1970s, long after each discipline and each major publisher had settled
down with its preferred styles. The LC format for library cards was
universally used in the US, but it contains and omits various
categories of information that are not coextensive with those used in
bibliographies.

It ought to come with a dozen or so standard output styles, and the
ability to combine the building blocks plus punctuation into any
additional styles one might encounter.


Even within the same scientific journal, bibliographies tend to be
formatted differently.


Not if the copyeditors are doing their job. (I was a Manuscript Editor
at Astrophysical Journal for two years.)


It doesn't seem too much to ask that "Text to Table" could come up
with a tabular presentation, which some other module could then
convert to the "format" used by the bibliographic database: if it
knows that col. 1 is the author, col. 2 is the date, col. 3 is the
title, col. 4 is the place, and col. 5 is the publisher (that's a
basic Book entry), why can't it simply do that?


Now you are no longer talking static text, you are talking (poorly)


"Poorly"? CMS has been around since 1906 and is by far the leading
style guide in the US.


If I would ask the medical doctors what the leading US style guide
would be, they would say AMA.
If I would ask the psychologists what the leading US style would be,
they would say APA.
If I would ask the legal people what the leading US style would be,
they would say Bluebook.
If I would ask the average school kid writing a little science paper
what the leading US style would be, they would say Turabian.


Gotcha. Turabian is based on Chicago (which she was the editor of for
50 years or so).

The other three probably do not predate 1906.

And if you would ask the rest of the world what the most commonly used
style would be, they would probably say Harvard (which is also the
oldest one if I'm not mistaken).


I'm not aware that a style called "Harvard" is used in the US. In what
publication is it codified?

I am not trying to say that CMS isn't important or widely used, it is
just that everybody feels that his or her style is the most important
and commonly used one while it isn't.

On a side note, of the above list, only APA and Turabian are supported
in Word 2007.


I've noticed that. Kinda leaves the humanists, who tend to use MLA, up
the creek.

formatted text. And you would have to have a tool to map columns to
fields, since in my case, year should be the last entry (except maybe
for pages) in my bibliography and most certainly not the second.


So you don't do author-date references in the text? Fine. That would
be Chicago's "Humanities" style.


The style I use is not supported by Word 2007 at all. I did write the
transformation stylesheet for it from scratch.

And in your case, how is your book displayed if it is an anonymous
work? I would guess col. 1 is the title, col. 2 is the date, col. 3 is
the place, and col. 4 is the publisher. So even between 2 entries of
the same type, the ordering of data would be different.


No, col. 1 would be empty. (Though there are circumstances in which
the author is entered as "Anonymous"; see CMS.)


And how do you expect your static text parser to guess that column one
is empty? Once you start adding delimiters, you can just as well use
the delimiters Microsoft defined. Those delimiters being XML tags. They
might not be what you prefer as delimiters, but they are delimiters.


I don't know what a "static text parser" is. Did you again forget that
I've put tabs between the fields, in order to do Text to Table? (The
punctuation between each pair of fields differs through each
paragraph, so it can't go by comma or period or colon.)

Maybe you don't have anonymous works, but it doesn't matter. What you
require is so specific that you will probably be the only one using the
'import filter' anyway. The point is, Microsoft provides a set of
generic tools which works for 80% of their customers. There is no


You are again losing sight of the point. They provide _no_ tool for
going from an existing bibliography to the bibliography database.


The tools are there, they are just not obvious in use for the average
Word user.


You just recently told me that it's _not_ possible.

The format of a b:Source element is entirely defined by an xml schema.
All you have to do is write a (simple) XSLT which transforms your
format into the format described by that schema. Of course, if your
format happens to be an incomprehensible static text, your XSLT will
be very complicated. But you can not blame Microsoft for that.


I have no idea what a "b:Source element," an "xml schema," an "XSLT
schema," however simple or complex, or an "incomprehensible static
text" may be.

point for them in developing a tool which will work in a very specific
case (yours) and therefore will target 1% or less of their customer
base. If you want one, you will have to write it yourself. They
provide a specification of the bibliography format and even provide a
programming interface (I have no experience with it). They try to help
you a long way, but the last few steps you will have to take yourself.


On the occasions when programming new reference styles has been
mentioned here, the MVPs have stated it appears to be impossibly
complicated to do so.


It is not. It is pretty basic XSLT, nothing fancy about it.

Like you said above, all people have to do is read the available
help:
* you have the open xml specification;


I do?

* you have blog articles by Microsoft people;


I do?

* you have MSDN articles describing the format (not extensively
though);


I do?

* you have 10 predefined styles, each consisting of a couple of
thousand lines of XSLT code (that is over 10,000 lines of example code)


That's an awful lot of code.

So there is plenty of information around. Maybe it is not perfectly
organized, but it is there if you want to learn how to use it.

But the MVPs are correct, as long as there is no point and click
solution, it is too complicated for the average Word user. And no
matter how many help files you are going to add, it will remain too
complicated.


Yet somehow I didn't find Papyrus the least bit complicated -- though
apparently it's too much for you??

I will soon find out how greatly it respects CMS style, especially for
complicated entries.


Well I never use the style, but since Word only defines one version,
and the Chicago style is different for different research fields, I
would not get my hopes up if I were you.


Since the 14th ed., CMS has had two different and parallel schemata,
the old humanities style, and the author-date style favored in the
social sciences. The U of C Press does little or nothing in hard
sciences that might need other provisions. (And the 15th has grown
intolerably permissive, perhaps as those who knew Mrs. Turabian --
unfortunately I never met her -- themselves retire.)
  #18
Posted to microsoft.public.word.docmanagement
p0
text to bibliography?


convincing any user of reading even 25 pages before he can start using
a program, let alone 500. People just aren't patient enough anymore
for looking up things in a help file (see also my comment below on
creating new formatting styles).


No, that's not it at all. You cannot use a "Help" file unless you
happen to know the exact name that the writers of the "Help" have
assigned to a feature/bug. Look how many times a day it is asked here
how to get rid of the dots between words, or why there's suddenly no
vertical space between the pages.


Help files are perfectly searchable nowadays. You write about the
'dots between words' question. It is indeed a frequently asked
question.

So I just started Word, pressed F1, and entered the following in the
help box "dot words" (without the quotes). And guess what, the 8th
entry in the list of results is titled: "I see dots and arrows in my
document". I click the link, and yes, it tells me exactly how to get
rid of those dots. It is there in the help, seconds away for people to
find it. Yet they still go to newsgroups to get an answer to their
question which they could easily find themselves. And they do get the
answer in here. What is more, the question and the answer in the
newsgroup are both indexed by Google. A 2 second Google search would
prevent the next person from asking the exact same question.

Note that this is not a complaint about the (quality of) posts in a
newsgroup. I merely wish to point out that people don't read help
pages or look for answers already out there. And remarks like 'but it
is easier to just ask the question' don't count. Newsgroups are slow.
They sometimes have to wait 24 hours for an answer they could have
found in 2 seconds.

And actually, your "I do?" remarks are plain examples of you not being
willing to look for help on the subject. I don't blame you for not
wanting to look things up, but at least don't say the answers aren't
available.

That aside, the bibliographic tools of Word 2007 lack almost all
documentation; the promised SDK is almost a year overdue now (I doubt
it will ever be released); and none of the people originally working on
the academic features seem to be still doing that job nowadays.


What's an SDK? You use their jargon, you must be one of them!


An SDK is a Software Development Kit, and it is a term widely used in
the software business. It is most certainly not coined by Microsoft.

It doesn't seem too much to ask that "Text to Table" could come up
with a tabular presentation, which some other module could then
convert to the "format" used by the bibliographic database: if it
knows that col. 1 is the author, col. 2 is the date, col. 3 is the
title, col. 4 is the place, and col. 5 is the publisher (that's a
basic Book entry), why can't it simply do that?


Now you are no longer talking static text, you are talking (poorly)


"Poorly"? CMS has been around since 1906 and is by far the leading
style guide in the US.


If I would ask the medical doctors what the leading US style guide
would be, they would say AMA.
If I would ask the psychologists what the leading US style would be,
they would say APA.
If I would ask the legal people what the leading US style would be,
they would say Bluebook.
If I would ask the average school kid writing a little science paper
what the leading US style would be, they would say Turabian.


Gotcha. Turabian is based on Chicago (which she was the editor of for
50 years or so).


Gotcha? Does it matter which one is derived from which one? They are
different. Hence, any static text parser would have to work
differently on both styles.

The other three probably do not predate 1906.

And if you would ask the rest of the world what the most commonly used
style would be, they would probably say Harvard (which is also the
oldest one if I'm not mistaken).


I'm not aware that a style called "Harvard" is used in the US. In what
publication is it codified?


I personally don't use it. But it is used in almost all science fields
in Western Europe and the British Commonwealth (so that's including
Australia and New Zealand). So I think it is probably the most widely
used system.

I am not trying to say that CMS isn't important or widely used, it is
just that everybody feels that his or her style is the most important
and commonly used one while it isn't.


On a side note, of the above list, only APA and Turabian are supported
in Word 2007.


I've noticed that. Kinda leaves the humanists, who tend to use MLA, up
the creek.


Not really, Word 2007 supports MLA out of the box.

formatted text. And you would have to have a tool to map columns to
fields, since in my case, year should be the last entry (except maybe
for pages) in my bibliography and most certainly not the second.


So you don't do author-date references in the text? Fine. That would
be Chicago's "Humanities" style.


The style I use is not supported by Word 2007 at all. I did write the
transformation stylesheet for it from scratch.


And in your case, how is your book displayed if it is an anonymous
work? I would guess col. 1 is the title, col. 2 is the date, col. 3 is
the place, and col. 4 is the publisher. So even between 2 entries of
the same type, the ordering of data would be different.


No, col. 1 would be empty. (Though there are circumstances in which
the author is entered as "Anonymous"; see CMS.)


And how do you expect your static text parser to guess that column one
is empty? Once you start adding delimiters, you can just as well use
the delimiters Microsoft defined. Those delimiters being XML tags. They
might not be what you prefer as delimiters, but they are delimiters.


I don't know what a "static text parser" is. Did you again forget that
I've put tabs between the fields, in order to do Text to Table? (The
punctuation between each pair of fields differs through each
paragraph, so it can't go by comma or period or colon.)


Static text is text without any kind of markup or delimiters
indicating clearly where fields start and/or end. It is what you would
call text without any codes.

Once again, you want to use your tabs, Microsoft wants you to use
their tabs, their tabs being the xml tags I showed earlier. It is not
because their tabs are longer than yours (and a lot more descriptive),
that they are worse. It is all a matter of taste.
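
To make the difference concrete, here is a small Python sketch (the two entries are invented, and the punctuation splitter is deliberately naive). With tabs, an anonymous work still yields five columns, the first one empty. With punctuation only, the number of pieces changes from entry to entry, so a program cannot tell which piece is which.

```python
import re

# Invented entries with tabs between fields
# (author, date, title, place, publisher).
signed = "Doe, Jane\t1906\tAn Example Book\tChicago\tExample Press"
anonymous = "\t1906\tAn Example Book\tChicago\tExample Press"


def split_on_tabs(entry):
    """Split on tabs; an empty column survives as an empty string."""
    return entry.split("\t")


def split_on_punctuation(entry):
    """Naive split of running text on '. ', ': ' and ', ' (illustrative only)."""
    return [part for part in re.split(r"[.:,]\s+", entry.rstrip(".")) if part]


print(split_on_tabs(signed))      # five columns
print(split_on_tabs(anonymous))   # five columns, the first one empty

# The same two books as running text, with no delimiters at all.
print(split_on_punctuation(
    "Doe, Jane. 1906. An Example Book. Chicago: Example Press."))
print(split_on_punctuation(
    "1906. An Example Book. Chicago: Example Press."))
```

The XML tags do the same job as the tabs, with the added benefit that each field carries its own name instead of relying on its position.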

Maybe you don't have anonymous works, but it doesn't matter. What you
require is so specific that you will probably be the only one using the
'import filter' anyway. The point is, Microsoft provides a set of
generic tools which works for 80% of their customers. There is no


You are again losing sight of the point. They provide _no_ tool for
going from an existing bibliography to the bibliography database.


The tools are there, they are just not obvious in use for the average
Word user.


You just recently told me that it's _not_ possible.


It is possible, I showed you which tags to use in the simple example I
gave in one of the previous posts.

However, to automate the entire process, the input format has to be
perfectly known. That is, no single exception can be left aside
(anonymous works, corporate authors, ...). Once you have fully defined
your format, all you have to do is provide a mapping between your
fields and the fields defined by Microsoft.
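
A minimal Python sketch of such a mapping, assuming tab-delimited entries. The two column orders and the field labels are hypothetical choices for illustration (they are not Microsoft's element names), and the entries are invented. The point is only that the column order has to be declared per format, and that an empty column (an anonymous work) must still be present as a column.

```python
# Column orders are assumptions chosen for illustration; the field names
# are hypothetical labels, not Microsoft's element names.
AUTHOR_DATE = ("author", "year", "title", "city", "publisher")
YEAR_LAST = ("author", "title", "city", "publisher", "year")


def map_entry(entry, columns):
    """Map one tab-delimited entry onto named fields, dropping empty ones."""
    values = entry.rstrip("\n").split("\t")
    if len(values) != len(columns):
        # An entry that does not fit the declared format is rejected
        # rather than guessed at.
        raise ValueError(
            "expected %d columns, got %d" % (len(columns), len(values)))
    return {name: value.strip()
            for name, value in zip(columns, values)
            if value.strip()}


# Invented sample entries: one signed (author-date order),
# one anonymous (year-last order, empty first column).
print(map_entry(
    "Doe, Jane\t1906\tAn Example Book\tChicago\tExample Press", AUTHOR_DATE))
print(map_entry(
    "\tAn Example Book\tChicago\tExample Press\t1906", YEAR_LAST))
```

Every exception to the format (corporate authors, editors, missing dates) needs its own rule on top of this, which is where the real work is.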

So is it possible to do it in an automated way? Yes. Is it doable? No.
There are so many versions of every style format that your
'translator' would be either just working in your specific case, or be
a huge monster which takes years to make and would even then not cover
some exceptions. Microsoft decided not to create the monster (and I
can't blame them). Instead, they decided to give you the tools to
create your translator for your specific case. But for someone without
any programming skills, those tools are too hard to use.

The format of a b:Source element is entirely defined by an xml schema.
All you have to do is write a (simple) XSLT which transforms your
format into the format described by that schema. Of course, if your
format happens to be an incomprehensible static text, your XSLT will
be very complicated. But you can not blame Microsoft for that.


I have no idea what a "b:Source element," an "xml schema," an "XSLT
schema," however simple or complex, or an "incomprehensible static
text" may be.


And that is the main problem. I do not blame you for not knowing them.
But they are available to you. If you want to learn how to use them,
you can. It is all about reading the documentation on those
technologies.

I never used XSLT before I started using Word 2007. It took me a
couple of hours to figure out how it worked and I could start creating
my own stuff. I agree that coming from a computer science background
gave me an advantage, but still ... I had to start from zero.

point for them in developing a tool which will work in a very specific
case (yours) and therefore will target 1% or less of their customer
base. If you want one, you will have to write it yourself. They
provide a specification of the bibliography format and even provide a
programming interface (I have no experience with it). They try to help
you a long way, but the last few steps you will have to take yourself.


On the occasions when programming new reference styles has been
mentioned here, the MVPs have stated it appears to be impossibly
complicated to do so.


It is not. It is pretty basic XSLT, nothing fancy about it.


Like you said above, all people have to do is read the available
help:
* you have the open xml specification;


I do?


Yes. It is an ECMA standard (and now even an ISO standard). The
specification is open and freely available.

ECMA: http://www.ecma-international.org/pu...s/Ecma-376.htm
Microsoft: http://msdn.microsoft.com/en-us/office/aa905545.aspx
ISO: they are still finalizing the text

* you have blog articles by Microsoft people;


I do?


Yes. Probably the best example out there to get you started on
creating your own bibliographic style is
http://blogs.msdn.com/microsoft_offi...ions-1011.aspx
but there are others.

* you have MSDN articles describing the format (not extensively
though);


I do?


Yes. For example http://msdn.microsoft.com/en-us/library/bb258052.aspx


* you have 10 predefined styles, each consisting of a couple of
thousand lines of XSLT code (that is over 10,000 lines of example code)


That's an awful lot of code.


Yes, that is an awful lot of EXAMPLES. Since when is having too many
examples a bad thing? Besides, if you want an example of it being
broken down to its bare minimum, you can check the blog article above.

So there is plenty of information around. Maybe it is not perfectly
organized, but it is there if you want to learn how to use it.


But the MVPs are correct, as long as there is no point and click
solution, it is too complicated for the average Word user. And no
matter how many help files you are going to add, it will remain too
complicated.


Yet somehow I didn't find Papyrus the least bit complicated -- though
apparently it's too much for you??


I haven't tried it, I just pointed out that it comes with a lot of
documentation. Word isn't complicated to use either if you stick to
the basic tasks. It still comes with a huge documentation though.

I will soon find out how greatly it respects CMS style, especially for
complicated entries.


Well I never use the style, but since Word only defines one version,
and the Chicago style is different for different research fields, I
would not get my hopes up if I were you.


Since the 14th ed., CMS has had two different and parallel schemata,
the old humanities style, and the author-date style favored in the
social sciences. The U of C Press does little or nothing in hard
sciences that might need other provisions. (And the 15th has grown
intolerably permissive, perhaps as those who knew Mrs. Turabian --
unfortunately I never met her -- themselves retire.)


This is the last post I make to this thread, because I feel we are
starting to just argue for the sake of arguing.

To come back to your original question: "text to bibliography?" Yes it
is possible to automate that process but highly complex and therefore
99% of the people out there will not be able to do it and the
practical answer is: No.

If you have specific questions about changing existing styles or need
help on creating your own style, just post a message to the newsgroup
and if I am around, I will try to help you.

Yves
  #19
Posted to microsoft.public.word.docmanagement
grammatim[_2_]
text to bibliography?

On Aug 18, 5:23 am, p0 wrote:
convincing any user of reading even 25 pages before he can start using
a program, let alone 500. People just aren't patient enough anymore
for looking up things in a help file (see also my comment below on
creating new formatting styles).


No, that's not it at all. You cannot use a "Help" file unless you
happen to know the exact name that the writers of the "Help" have
assigned to a feature/bug. Look how many times a day it is asked here
how to get rid of the dots between words, or why there's suddenly no
vertical space between the pages.


Help files are perfectly searchable nowadays. You write about the
'dots between words' question. It is indeed a frequently asked
question.

So I just started Word, pressed F1, and entered the following in the
help box "dot words" (without the quotes). And guess what, the 8th
entry in the list of results is titled: "I see dots and arrows in my
document". I click the link, and yes, it tells me exactly how to get
rid of those dots. It is there in the help, seconds away for people to
find it. Yet they still go to newsgroups to get an answer to their
question which they could easily find themselves. And they do get the
answer in here. What is more, the question and the answer in the
newsgroup are both indexed by Google. A 2 second Google search would
prevent the next person from asking the exact same question.

Note that this is not a complaint about the (quality of) posts in a
newsgroup. I merely wish to point out that people don't read help
pages or look for answers already out there. And remarks like 'but it
is easier to just ask the question' don't count. Newsgroups are slow.
They sometimes have to wait 24 hours for an answer they could have
found in 2 seconds.


It sure looks like a complaint about the quality of posts.

When I try to use Help it's generally for something a tad more
complicated, like why I can't type Tibetan even though I have a
Tibetan font and a Tibetan keyboard installed.

Another frequent question is about those brackety things that appear
instead of a ToC or an entire Index. If you don't know the name "field
code," how do you find it in "Help"?

And actually, your "I do?" remarks are plain examples of you not being
willing to look for help on the subject. I don't blame you for not
wanting to look things up, but at least don't say the answers aren't
available.


I noted that I did not know that there is a 500-page manual for
Papyrus, because I never needed a manual.

In a "Help" system, the answers are only available if you happen to
hit on the exact name used for the problem.

That aside, the bibliographic tools of Word 2007 lack almost all
documentation; the promised SDK is almost a year overdue now (I doubt
it will ever be released); and none of the people originally working on
the academic features seem to be still doing that job nowadays.


What's an SDK? You use their jargon, you must be one of them!


An SDK is a Software Development Kit, and it is a term widely used in
the software business. It is most certainly not coined by Microsoft.


You're one of them -- the people who talk about SDKs.

It doesn't seem too much to ask that "Text to Table" could come up
with a tabular presentation, which some other module could then
convert to the "format" used by the bibliographic database: if it
knows that col. 1 is the author, col. 2 is the date, col. 3 is the
title, col. 4 is the place, and col. 5 is the publisher (that's a
basic Book entry), why can't it simply do that?


Now you are no longer talking static text, you are talking (poorly)


"Poorly"? CMS has been around since 1906 and is by far the leading
style guide in the US.


If I would ask the medical doctors what the leading US style guide
would be, they would say AMA.
If I would ask the psychologists what the leading US style would be,
they would say APA.
If I would ask the legal people what the leading US style would be,
they would say Bluebook.
If I would ask the average school kid writing a little science paper
what the leading US style would be, they would say Turabian.


Gotcha. Turabian is based on Chicago (which she was the editor of for
50 years or so).


Gotcha? Does it matter which one is derived from which one? They are
different. Hence, any static text parser would have to work
differently on both styles.

The other three probably do not predate 1906.


And if you asked the rest of the world what the most commonly used
style is, they would probably say Harvard (which is also the
oldest one if I'm not mistaken).


I'm not aware that a style called "Harvard" is used in the US. In what
publication is it codified?


I personally don't use it. But it is used in almost all science fields
in Western Europe and the British Commonwealth (so that's including
Australia and New Zealand). So I think it is probably the most widely
used system.


Use "Help" to find out why it's called "Harvard" in the non-Harvard-
country-English-speaking world.

I am not trying to say that CMS isn't important or widely used; it is
just that everybody feels that his or her style is the most important
and commonly used one when it isn't.


On a side note, of the above list, only APA and Turabian are supported
in Word 2007.


I've noticed that. Kinda leaves the humanists, who tend to use MLA, up
the creek.


Not really, Word 2007 supports MLA out of the box.



formatted text. And you would have to have a tool to map columns to
fields, since in my case, year should be the last entry (except maybe
for pages) in my bibliography and most certainly not the second.


So you don't do author-date references in the text? Fine. That would
be Chicago's "Humanities" style.


The style I use is not supported by Word 2007 at all. I did write the
transformation stylesheet for it from scratch.


And in your case, how is your book displayed if it is an anonymous
work? I would guess col. 1 is the title, col. 2 is the date, col. 3 is
the place, and col.4 is the publisher. So even between 2 entries of
the same type, the ordering of data would be different.


No, col. 1 would be empty. (Though there are circumstances in which
the author is entered as "Anonymous"; see CMS.)


And how do you expect your static text parser to guess that column one
is empty? Once you start adding delimiters, you can just as well use
the delimiters Microsoft defined. Those delimiters are XML tags. They
might not be what you prefer as delimiters, but they are delimiters.


I don't know what a "static text parser" is. Did you again forget that
I've put tabs between the fields, in order to do Text to Table? (The
punctuation between each pair of fields differs through each
paragraph, so it can't go by comma or period or colon.)


Static text is text without any kind of markup or delimiters
clearly indicating where fields start and/or end. It is what you would
call text without any codes.
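
To illustrate the difference, here is a minimal Python sketch. The entry itself, the column order (author, date, title, place, publisher) and the helper name split_fields are made up for the example. With tab delimiters an empty author column survives as an empty string, while the formatted text gives a program nothing to go on.

```python
# Hypothetical tab-delimited book entry in the order
# author, date, title, place, publisher.
# The author column is empty (an anonymous work).
delimited = "\t1906\tAn Example Title\tChicago\tExample Press"


def split_fields(line):
    """Split one tab-delimited entry into columns, keeping empty ones."""
    return line.rstrip("\n").split("\t")


fields = split_fields(delimited)
# fields[0] is the empty string, so the missing author is still detectable
# and every other column keeps its position.

# The same entry as formatted ("static") text. Nothing marks where a field
# starts or ends, and nothing says that the author is missing.
static = "An Example Title. Chicago: Example Press, 1906."
```

So the tabs already are delimiters in this sense; the open question is only how to map them onto the field names the bibliography format expects.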

Once again, you want to use your tabs, Microsoft wants you to use
their tabs, their tabs being the XML tags I showed earlier. That
their tabs are longer than yours (and a lot more descriptive) does
not make them worse. It is all a matter of taste.

Maybe you don't have anonymous works, but it doesn't matter. What you
require is so specific that you will probably be the only one using the
'import filter' anyway. The point is, Microsoft provides a set of
generic tools which works for 80% of their customers. There is no


You are again losing sight of the point. They provide _no_ tool for
going from an existing bibliography to the bibliography database.


The tools are there, they are just not obvious in use for the average
Word user.


You just recently told me that it's _not_ possible.


It is possible, I showed you which tags to use in the simple example I
gave in one of the previous posts.


I said, a tool for going from text to database. As you pointed out, I
can do it by myself without the assistance of any tool in Word.

However, to automate the entire process, the input format has to be
perfectly known. That is, no single exception can be left aside
(anonymous works, corporate authors, ...). Once you have fully defined
your format, all you have to do is provide a mapping between your
fields and the fields defined by Microsoft.
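
As a rough sketch of such a mapping, assuming a five-column tab-delimited book entry (author as "Last, First", year, title, place, publisher): the Python below builds one b:Source element per line. The namespace is the one from the item1.xml example earlier in the thread. The child element names (Tag, SourceType, Author/NameList/Person, Title, Year, City, Publisher) are the ones I believe Word writes for a book and should be checked against the schema; the function names and the "Src1" style tags are invented for the example.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the item1.xml example earlier in the thread.
B_NS = "http://schemas.openxmlformats.org/officeDocument/2006/bibliography"
ET.register_namespace("b", B_NS)


def _add(parent, name, text=None):
    """Append a b:<name> child to parent, optionally with text content."""
    child = ET.SubElement(parent, "{%s}%s" % (B_NS, name))
    if text is not None:
        child.text = text
    return child


def book_to_source(line, tag):
    """Map one tab-delimited book entry onto a b:Source element.

    Assumed column order: author, year, title, place, publisher.
    A line without exactly five columns raises ValueError, because the
    input format has to be fully known before it can be mapped.
    """
    author, year, title, city, publisher = line.rstrip("\n").split("\t")
    source = ET.Element("{%s}Source" % B_NS)
    _add(source, "Tag", tag)
    _add(source, "SourceType", "Book")
    if author:  # an empty first column means an anonymous work
        last, _, first = author.partition(", ")
        name_list = _add(_add(_add(source, "Author"), "Author"), "NameList")
        person = _add(name_list, "Person")
        _add(person, "Last", last)
        if first:
            _add(person, "First", first)
    for name, value in (("Title", title), ("Year", year),
                        ("City", city), ("Publisher", publisher)):
        if value:
            _add(source, name, value)
    return source


def build_sources(lines):
    """Wrap the converted entries in a b:Sources root and serialize it."""
    root = ET.Element("{%s}Sources" % B_NS)
    for number, line in enumerate(lines, start=1):
        root.append(book_to_source(line, "Src%d" % number))
    return ET.tostring(root, encoding="unicode")
```

This only covers a single-author book. Corporate authors, multiple authors, editors and other source types would each need their own rule, which is exactly where the work piles up. The root attributes from the earlier example (SelectedStyle, StyleName) are left out here.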

So is it possible to do it in an automated way? Yes. Is it doable? No.
There are so many versions of every style format that your
'translator' would be either just working in your specific case, or be
a huge monster which takes years to make and would even then not cover
some exceptions. Microsoft decided not to create the monster (and I
can't blame them). Instead, they decided to give you the tools to
create your translator for your specific case. But for someone without
any programming skills, those tools are too hard to use.

The format of a b:Source element is entirely defined by an XML schema.
All you have to do is write a (simple) XSLT which transforms your
format into the format described by that schema. Of course, if your
format happens to be an incomprehensible static text, your XSLT will
be very complicated. But you cannot blame Microsoft for that.


I have no idea what a "b:Source element," an "xml schema," an "XSLT
schema," however simple or complex, or an "incomprehensible static
text" may be.


And that is the main problem. I do not blame you for not knowing them.
But they are available to you. If you want to learn how to use them,
you can. It is all about reading the documentation on those
technologies.

I never used XSLT before I started using Word 2007. It took me a
couple of hours to figure out how it worked and I could start creating
my own stuff. I agree that coming from a computer science background
gave me an advantage, but still ... I had to start from zero.

point for them in developing a tool which will work in a very specific
case (yours) and therefore will target 1% or less of their customer
base. If you want one, you will have to write it yourself. They
provide a specification of the bibliography format and even provide a
programming interface (I have no experience with it). They try to help
you a long way, but the last few steps you will have to take yourself.


On the occasions when programming new reference styles has been
mentioned here, the MVPs have stated it appears to be impossibly
complicated to do so.


It is not. It is pretty basic XSLT, nothing fancy about it.


Like you said above, all people have to do is read the available
help:
* you have the


...

read more »


  #20   Report Post  
Posted to microsoft.public.word.docmanagement
grammatim[_2_] grammatim[_2_] is offline
external usenet poster
 
Posts: 2,751
Default text to bibliography?

On Aug 18, 5:23 am, p0 wrote:

I tried to move my cursor to the next spot, and my Reply sent itself!

On a side note, of the above list, only APA and Turabian are supported
in Word 2007.


I've noticed that. Kinda leaves the humanists, who tend to use MLA, up
the creek.


Not really, Word 2007 supports MLA out of the box.


Since I don't have the box yet, how would I know that?

formatted text. And you would have to have a tool to map columns to
fields, since in my case, year should be the last entry (except maybe
for pages) in my bibliography and most certainly not the second.


So you don't do author-date references in the text? Fine. That would
be Chicago's "Humanities" style.


The style I use is not supported by Word 2007 at all. I did write the
transformation stylesheet for it from scratch.


And in your case, how is your book displayed if it is an anonymous
work? I would guess col. 1 is the title, col. 2 is the date, col. 3 is
the place, and col.4 is the publisher. So even between 2 entries of
the same type, the ordering of data would be different.


No, col. 1 would be empty. (Though there are circumstances in which
the author is entered as "Anonymous"; see CMS.)


And how do you expect your static text parser to guess that column one
is empty? Once you start adding delimiters, you can just as well use
the delimiters Microsoft defined. Those delimiters are XML tags. They
might not be what you prefer as delimiters, but they are delimiters.


I don't know what a "static text parser" is. Did you again forget that
I've put tabs between the fields, in order to do Text to Table? (The
punctuation between each pair of fields differs through each
paragraph, so it can't go by comma or period or colon.)


Static text is text without any kind of markup or delimiters
clearly indicating where fields start and/or end. It is what you would
call text without any codes.

Once again, you want to use your tabs, Microsoft wants you to use
their tabs, their tabs being the XML tags I showed earlier. That
their tabs are longer than yours (and a lot more descriptive) does
not make them worse. It is all a matter of taste.


And if they are n characters long, they take n times as long to type.

Maybe you don't have anonymous works, but it doesn't matter. What you
require is so specific that you will probably be the only one using the
'import filter' anyway. The point is, Microsoft provides a set of
generic tools which works for 80% of their customers. There is no


You are again losing sight of the point. They provide _no_ tool for
going from an existing bibliography to the bibliography database.


The tools are there, they are just not obvious in use for the average
Word user.


You just recently told me that it's _not_ possible.


It is possible, I showed you which tags to use in the simple example I
gave in one of the previous posts.


That's not a tool. That's handwork that might not be inconvenient if
one had a grad student handy to do it. (My first job in Chicago was
retyping pages of a professor's book ms. each day to incorporate the
changes he'd made the previous day. Fortunately it's a catalog of
cuneiform texts so each entry began a new ms. page.)

However, to automate the entire process, the input format has to be
perfectly known. That is, no single exception can be left aside
(anonymous works, corporate authors, ...). Once you have fully defined
your format, all you have to do is provide a mapping between your
fields and the fields defined by Microsoft.

So is it possible to do it in an automated way? Yes. Is it doable? No.
There are so many versions of every style format that your
'translator' would be either just working in your specific case, or be
a huge monster which takes years to make and would even then not cover
some exceptions. Microsoft decided not to create the monster (and I
can't blame them). Instead, they decided to give you the tools to
create your translator for your specific case. But for someone without
any programming skills, those tools are too hard to use.


Q.E.D.

The format of a b:Source element is entirely defined by an XML schema.
All you have to do is write a (simple) XSLT which transforms your
format into the format described by that schema. Of course, if your
format happens to be an incomprehensible static text, your XSLT will
be very complicated. But you cannot blame Microsoft for that.


I have no idea what a "b:Source element," an "xml schema," an "XSLT
schema," however simple or complex, or an "incomprehensible static
text" may be.


And that is the main problem. I do not blame you for not knowing them.
But they are available to you. If you want to learn how to use them,
you can. It is all about reading the documentation on those
technologies.

I never used XSLT before I started using Word 2007. It took me a
couple of hours to figure out how it worked and I could start creating
my own stuff. I agree that coming from a computer science background
gave me an advantage, but still ... I had to start from zero.


No, you started from a computer science background. When I was an
undergraduate at Cornell (1968-72), there was one class in "computer
programming" available for non-majors, and it instantly filled up
every semester. When I was a grad student at Chicago (1972-76), I was
able to take Vic Yngve's class in COMIT II, a language he invented to
be like human language (and carried out a couple of publishable
projects as a result -- see ch. 5 [IIRC] of I. J. Gelb et al.'s
*Computer-Aided Analysis of Amorite* [1980]). That's not much use in
dealing with whatever programming may be today.

point for them in developing a tool which will work in a very specific
case (yours) and therefore will target 1% or less of their customer
base. If you want one, you will have to write it yourself. They
provide a specification of the bibliography format and even provide a
programming interface (I have no experience with it). They try to help
you a long way, but the last few steps you will have to take yourself.


On the occasions when programming new reference styles has been
mentioned here, the MVPs have stated it appears to be impossibly
complicated to do so.


It is not. It is pretty basic XSLT, nothing fancy about it.


Like you said above, all people have to do is read the available
help:
* you have the


...

read more »


[not without Sending this one!]


  #21   Report Post  
Posted to microsoft.public.word.docmanagement
grammatim[_2_] grammatim[_2_] is offline
external usenet poster
 
Posts: 2,751
Default text to bibliography?

On Aug 18, 5:23 am, p0 wrote:

On the occasions when programming new reference styles has been
mentioned here, the MVPs have stated it appears to be impossibly
complicated to do so.


It is not. It is pretty basic XSLT, nothing fancy about it.


Like you said above, all people have to do is read the available
help:
* you have the open xml specification;


I do?


Yes. It is an ECMA standard (and now even an ISO standard). The
specification is open and freely available.

ECMA: http://www.ecma-international.org/pu...s/Ecma-376.htm
Microsoft: http://msdn.microsoft.com/en-us/office/aa905545.aspx
ISO: they are still finalizing the text

* you have blog articles by Microsoft people;


I do?


Yes. Probably the best example out there to get you started on
creating your own bibliographic style is http://blogs.msdn.com/microsoft_office_word/archive/2007/12/14/biblio...
but there are others.

* you have MSDN articles describing the format (not extensively
though);


I do?


Yes. For example http://msdn.microsoft.com/en-us/library/bb258052.aspx


To come back to your original question: "text to bibliography?" Yes, it
is possible to automate that process, but doing so is highly complex.
99% of the people out there will therefore not be able to do it, and
the practical answer is: No.


Funny definition of "automate" ...

Thanks for the links. I'll look at them to see how daunting they are.
  #22   Report Post  
Posted to microsoft.public.word.docmanagement
grammatim[_2_] grammatim[_2_] is offline
external usenet poster
 
Posts: 2,751
Default bibliography tool turns out to be useless text to bibliography?

On Aug 18, 5:23 am, p0 wrote:

* you have blog articles by Microsoft people;


I do?


Yes. Probably the best example out there to get you started on
creating your own bibliographic style is http://blogs.msdn.com/microsoft_office_word/archive/2007/12/14/biblio...
but there are others.


(1) I don't see any way to get to any other "blogs" that might have
been posted. I see that many people asked questions, and none were
answered.

(2a) One of the comments notes that it can't handle (2007a, 2007b),
and (2b) another notes that it can't handle "Smith (1997) states
that ..." vs. "It has been claimed (Smith 1997) that ..."

Both of those factors (2a) and (2b) mean that the entire tool is
utterly useless.
  #23   Report Post  
Posted to microsoft.public.word.docmanagement
p0 p0 is offline
external usenet poster
 
Posts: 254
Default bibliography tool turns out to be useless text to bibliography?


(2a) One of the comments notes that it can't handle (2007a, 2007b),
and (2b) another notes that it can't handle "Smith (1997) states
that ..." vs. "It has been claimed (Smith 1997) that ..."


(2a) http://office.microsoft.com/en-us/wo...674921033.aspx

"If you choose a GOST or ISO 690 style for your sources and a citation
is not unique, append an alphabetic character to the year. For
example, a citation would appear as [Pasteur, 1848a]."

(2b) Right click on the citation, select "Edit citation" and then
select the "Author" in the "Suppress" frame.

Yves
  #24   Report Post  
Posted to microsoft.public.word.docmanagement
grammatim[_2_] grammatim[_2_] is offline
external usenet poster
 
Posts: 2,751
Default bibliography tool turns out to be useless text to bibliography?

On Aug 19, 2:16 am, p0 wrote:
(2a) One of the comments notes that it can't handle (2007a, 2007b),
and (2b) another notes that it can't handle "Smith (1997) states
that ..." vs. "It has been claimed (Smith 1997) that ..."


(2a) http://office.microsoft.com/en-us/wo...674921033.aspx

"If you choose a GOST or ISO 690 style for your sources and a citation
is not unique, append an alphabetic character to the year. For
example, a citation would appear as [Pasteur, 1848a]."


Is Chicago either a "GOST" or an "ISO 690" style?

If I have already referenced Smith 2007, and then I find that Smith
published another article in 2007 that also needs to be cited, then I
would expect the machine to know whether it will be (a) or (b)
according to its alphabetical order in the reference list, and to
change all the existing (2007) references to (2007a) or (2007b)
accordingly.
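
To pin down the behavior I mean, here is a small Python sketch. It says nothing about what Word actually does. The function name and the (author, year, title) tuples are invented for the example, and sorting by title stands in for "alphabetical order in the reference list."

```python
from collections import defaultdict


def assign_year_suffixes(sources):
    """Return a dict mapping each (author, year, title) tuple to a year label.

    A source that is the only one for its author and year keeps the bare
    year. Sources sharing author and year get a, b, c ... in title order.
    The sketch assumes no more than 26 sources per author and year.
    """
    groups = defaultdict(list)
    for source in sources:
        author, year, _title = source
        groups[(author, year)].append(source)

    labels = {}
    for (_author, year), items in groups.items():
        if len(items) == 1:
            labels[items[0]] = str(year)
            continue
        ordered = sorted(items, key=lambda item: item[2].lower())
        for index, source in enumerate(ordered):
            labels[source] = "%s%s" % (year, chr(ord("a") + index))
    return labels
```

Every in-text reference would then look up its label again whenever a source is added, so an existing (2007) turns into (2007a) or (2007b) without anyone retyping it.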

(2b) Right click on the citation, select "Edit citation" and then
select the "Author" in the "Suppress" frame.


And then type the author's name again outside the reference?