FROM : Scott Ellsworth
DATE : Sat Apr 30 00:43:55 2005
On Apr 29, 2005, at 3:29 PM, John Timmer wrote:
> Most sources of author information (ie - PubMed) don't track
> individual authors, so he'll essentially have no way of identifying
> when two authors are identical.
When we wrote such a tool, we parsed the PubMed author list and
noted that Andrews J A could match a number of different authors.
Andrews J A was thus stored as a separate author in our databases
from John Alan Andrews. A search for Andrews J A would return both.
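A rough sketch of that kind of lookup, in Python. This is not the code from our tool; match_key and search are hypothetical names, and the sketch assumes initials are space-separated as in "Andrews J A" and that any other name is written given-names-first. Each stored name stays a distinct record, and the search returns every record whose surname and initials agree with the query.

```python
def match_key(name):
    """Reduce a name to (surname, initials) for loose matching."""
    tokens = name.replace(".", "").split()
    if not tokens:
        return ("", "")
    # Assumption: "Andrews J A" style means surname first, then single-letter initials.
    if len(tokens) > 1 and all(len(t) == 1 for t in tokens[1:]):
        surname, given = tokens[0], tokens[1:]
    else:
        # Otherwise assume given names first, surname last ("John Alan Andrews").
        surname, given = tokens[-1], tokens[:-1]
    initials = "".join(t[0].upper() for t in given)
    return (surname.lower(), initials)


def search(authors, query):
    """Return every stored author whose key matches the query's key."""
    key = match_key(query)
    return [a for a in authors if match_key(a) == key]


# Illustrative records only; each one is kept as its own author.
authors = ["Andrews J A", "John Alan Andrews", "Andrews J B", "Jane Smith"]
print(search(authors, "Andrews J A"))
```

The two Andrews records are never merged; only the search treats them as candidates for the same person.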
> The second thing is that, although relating authors to the paper is
> easy, retaining the order of the multiple authors for a publication
> is absolutely essential. There has to be some way of
> reconstructing an ordered list out of that relationship set.
The best way I found was to store an index set.
Paper -> author_index_set ->> authors
If author_id is the unique author identifier, author_index_set rows
consist of:
author_index_set_id
author_id
author_order
For CD/WO-like systems where the pk is not user-serviceable data, we
allow the system to add a separate pk column. For other tools,
author_index_set_id and author_id act as a composite key.
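For illustration, here is a minimal sketch of the composite-key variant using Python's sqlite3 module. The three author_index_set columns are the ones listed above; the paper and author tables, their other columns, and the sample rows are assumptions made up for the example.

```python
import sqlite3

SCHEMA = """
CREATE TABLE author (
    author_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE author_index_set (
    author_index_set_id INTEGER NOT NULL,
    author_id           INTEGER NOT NULL REFERENCES author(author_id),
    author_order        INTEGER NOT NULL,
    PRIMARY KEY (author_index_set_id, author_id)  -- composite key
);
CREATE TABLE paper (
    paper_id            INTEGER PRIMARY KEY,
    title               TEXT NOT NULL,
    author_index_set_id INTEGER NOT NULL
);
"""


def ordered_authors(conn, paper_id):
    """Rebuild the ordered author list for one paper."""
    rows = conn.execute(
        """
        SELECT a.name
        FROM paper p
        JOIN author_index_set s
          ON s.author_index_set_id = p.author_index_set_id
        JOIN author a ON a.author_id = s.author_id
        WHERE p.paper_id = ?
        ORDER BY s.author_order
        """,
        (paper_id,),
    ).fetchall()
    return [name for (name,) in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO author (author_id, name) VALUES (?, ?)",
    [(1, "Andrews J A"), (2, "Baker C"), (3, "Chen D")],
)
conn.execute(
    "INSERT INTO paper (paper_id, title, author_index_set_id) "
    "VALUES (1, 'Example paper', 10)"
)
# Rows are inserted out of order on purpose; author_order restores the sequence.
conn.executemany(
    "INSERT INTO author_index_set VALUES (?, ?, ?)",
    [(10, 3, 2), (10, 1, 3), (10, 2, 1)],
)
print(ordered_authors(conn, 1))
```

The ORDER BY on author_order is what reconstructs the list; the order in which rows come back from the join is otherwise not guaranteed.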
Scott






Cocoa mail archive

