In effect, does this mean that any citation in the literature is
a potential taxon, the way you're handling it?
Walter Berendsohn:
Yes. Well, theoretically yes, but practically, no. We need to
exercise judgement, but the model allows it. In fact, any name in the
sense of any reference can be entered as a potential taxon. If a name
has important information linked to it, you can enter it into the
system as a new potential taxon waiting for someone to decide whether
it represents a new taxon (or circumscription), or whether it's the
same as an existing potential taxon. One of the advantages of this
model is that you can actually store all of this information until
somebody can look at it, and that the original information can be
retained beyond that for future scrutiny. So the potential taxon
concept has implications for information management as well as
information retrieval. Of course, there's a big danger of inflating
the number of potential taxa, and, as I said, the person indexing the
literature has to be aware of this. On the other hand, an
application could filter potential taxa for users who are not
taxonomists, and even for some taxonomists who don't want to see all
the applications that a name has had. I think the decisive thing is
that as soon as there is important information linked to a name, the
use of the name has to be recorded somehow, and then it has to be a
potential taxon.
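
The idea of a potential taxon as "a name in the sense of a reference" can be pictured as a small record. What follows is a minimal Python sketch, not the published model: the class and field names (PotentialTaxon, Status, visible_to_general_user) and the example names and references are invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    UNREVIEWED = "unreviewed"  # entered by an indexer, waiting for a taxonomist
    ACCEPTED = "accepted"      # judged to be a distinct taxon or circumscription
    MERGED = "merged"          # judged identical to an existing potential taxon


@dataclass(frozen=True)
class PotentialTaxon:
    name: str       # the scientific name as cited (hypothetical examples below)
    reference: str  # the publication in whose sense the name is used
    status: Status = Status.UNREVIEWED

    def label(self) -> str:
        return f"{self.name} sensu {self.reference}"


def visible_to_general_user(taxa):
    """Filter out every usage that a taxonomist has not accepted."""
    return [t for t in taxa if t.status is Status.ACCEPTED]


# Every recorded usage is kept, so nothing is lost before review.
taxa = [
    PotentialTaxon("Aus bus", "Smith 1950", Status.ACCEPTED),
    PotentialTaxon("Aus bus", "Jones 1980"),  # same name, still unreviewed
    PotentialTaxon("Aus cus", "Jones 1980", Status.MERGED),
]
labels = [t.label() for t in visible_to_general_user(taxa)]
```

In this sketch the store keeps all three usages, while a non-taxonomist's view shows only the accepted one, which is one possible way to limit the inflation of potential taxa at the application level.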
Frank Bisby:
Walter, as you know, I fully endorse the model and your usage of
it, with the potential taxa. What I do want to ask you, taking the
example you gave with the mosses, I believe you may underestimate the
sense in which potential taxon linking is already happening in many
of the taxonomic databases. You may not have realized this, but in
the example you gave for mosses, all of the names in your concept
synonyms had different author strings. Consequently, for instance in
the ILDIS system, or FishBase, or in many of the taxonomic databases,
exactly those same links would have been made, and the only
difference is that you've got an annotation. Again, we often have
those annotations in notes. So while I fully endorse it, and what you're
talking about is the precise, correct way of doing it, I think that
many of the well organized databases that have full synonymic
indexing do in fact have provision, or are part way towards your
taxon concept already. It could be added just by annotating the
author string. If you put the same name in twice, and put it with
sensu "X" and sensu "Y", you would be there already. So I don't think
we're so far away from it as you imagine.
Walter Berendsohn:
I agree, absolutely. The development of this model parallels
what has gone on in the biological collections community over the
past decade. If you model something that is heavily regimented, by
de-facto international business rules or even by international codes,
like nomenclature and taxonomy, you will end up with similar results.
The potential taxon model is just a synthesis, trying to provide
access to names, concepts, misapplications, multiple classifications,
and multiple hierarchies at the same time. And you are right that the
example wasn't well taken because I didn't include any identical
names; of course you can have identical names. If I had expanded the
discussion a little more I could have shown that the model supports
misapplications in the name itself, because there were quite a few
author citations in the example I gave. [Frank Bisby off mic: "But
again those are in annotations in the author strings."] You're
absolutely right, taxonomists have been doing this for a long, long
time, so we would hope that completely annotated records would
contain this information. The potential taxon concept just provides
an effective way to store information that was previously
unstructured.
Stan Blum:
The point that I would like to add to this is that one of the
critical implications of this is that people who are doing
identifications need a mechanism to link specimens and observations to
not just the name, but to the concept. That's not something
we're doing in natural history museums or probably anywhere else in
the field.
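
The difference between linking a specimen to a name and linking it to a concept can be shown with a small sketch. It is only an illustration under assumed names: Concept, Identification, and the two query functions are hypothetical, as are the specimen numbers and references.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    name: str       # the scientific name
    reference: str  # the treatment that circumscribes it


@dataclass(frozen=True)
class Identification:
    specimen_id: str
    concept: Concept  # the link goes to the concept, not to the bare name
    determiner: str


def specimens_for_name(identifications, name):
    """Name-only retrieval: mixes every sense in which the name was used."""
    return sorted(i.specimen_id for i in identifications if i.concept.name == name)


def specimens_for_concept(identifications, concept):
    """Concept retrieval: only specimens identified in this particular sense."""
    return sorted(i.specimen_id for i in identifications if i.concept == concept)


narrow = Concept("Aus bus", "Smith 1950")  # hypothetical narrow circumscription
broad = Concept("Aus bus", "Jones 1980")   # hypothetical broad circumscription

identifications = [
    Identification("S1", narrow, "A. Determiner"),
    Identification("S2", broad, "B. Determiner"),
]

by_name = specimens_for_name(identifications, "Aus bus")
by_concept = specimens_for_concept(identifications, narrow)
```

Here the name-only query returns both specimens, while the concept query returns only the one determined in the narrow sense.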
Stuart Nelson:
I wonder if you would talk a little bit about the definitional
attributes, about how it is that you're going to represent the
differences in these... I think the technical word for these things
that are almost the same is plesionyms. I wonder how you're going to
represent that difference, so you can represent things in multiple
hierarchies.
Walter Berendsohn:
Well, we could go on quite a while about the model; it is
published. I think you might be referring to two things now; the
relationship between names that refer to the same thing, and multiple
hierarchies, which taxonomists might think of as competing or
alternative classifications. I didn't even mention hierarchies here
because that would have extended my talk quite a bit, but if you are
talking about specifying the nature of the relationship between names
or potential taxa, the attributes you need are specified in the
model.
Stuart Nelson [off mic]:
I think you would want to represent on what basis the
differentiation is made. Can you specify the definitional attributes
of one taxon versus another?
Walter Berendsohn:
These defining characteristics are, in fact, what the taxonomist
includes in his or her taxonomic description or revision.
Unfortunately, we haven't yet arrived at a universally accepted way
to structure that data and to capture it in databases. For now, we
have to be pragmatic. We need to keep the scope of work practical, so
we can't record every slight difference in taxonomic treatments; no
two treatments of a taxon will be equal. So we have to have a cutoff
point, and that's the decision of the taxonomist who is trying to
record and represent the consensus view we are talking about. In
other cases it may not be possible to arrive at a single consensus
view and multiple classifications will need to be represented. The
model can accommodate several parallel accepted taxonomies without
any problem.
Nancy Morin:
Walter, I think in most or a lot of efforts where they're
computerizing collections information, it's only the most recent
identification that's being included and what you're saying suggests
that there is value in capturing the annotation history on a
specimen. Because that's how you're going to be able to link from one
concept to another... that this specimen was annotated as species "X"
by one person and species "Y" by another person.
Walter Berendsohn:
Well, all the more extensive models of collection systems I know
have identification histories in them...
Nancy Morin:
Right, but I think a lot of us take short cuts and don't do that
because it seems like an added time effort.
Walter Berendsohn:
Well, this is already a rather specialized thing. If we had all
the information, it would be useful, but for now I would rather
emphasize the more practical aspects. For example, if you were to use
a determination key in one of the floras in Central America, and you
came to a species name, you might not know how it relates to current
taxonomy. If you have the capability to look up the name in the
sense of Flora Guatemala, and find that somebody has already cleared
that up -- and found that the name has been applied to something
completely different, or has been replaced by another name, or is a
synonym, or whatever -- you could retrieve that information from a
system like this. This is the important part, to make use of all the
knowledge, paper-based or even computerized identification keys, that
are out there. This was one of my approaches to doing floristic work.
If you're working with a thirty year old flora, using potential taxa
allows you to get a checklist that captures changes in taxonomy and
leads you precisely to the currently accepted view.
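
The lookup described here, from a name in the sense of an old flora to the currently accepted view, can be sketched as following recorded relationships between potential taxa. This is a simplified illustration: the relation table, the relation labels, the resolve function, and all names and references are assumptions, not the published model's attributes.

```python
from typing import Dict, List, Tuple

Taxon = Tuple[str, str]  # (name, reference), i.e. a name in the sense of a work

# Hypothetical relationships recorded by a taxonomist:
# source potential taxon -> (kind of relationship, target potential taxon)
RELATIONS: Dict[Taxon, Tuple[str, Taxon]] = {
    ("Aus bus", "Old Flora 1960"): ("misapplied name for", ("Aus cus", "Revision 1985")),
    ("Aus cus", "Revision 1985"): ("synonym of", ("Aus dus", "Checklist 1995")),
}


def resolve(taxon: Taxon, relations: Dict[Taxon, Tuple[str, Taxon]]) -> List[Taxon]:
    """Follow relationships until a taxon with no outgoing link is reached.

    Returns the whole path, so the original usage stays traceable.
    """
    path = [taxon]
    seen = {taxon}
    while taxon in relations:
        _, taxon = relations[taxon]
        if taxon in seen:
            raise ValueError("cyclic relationship between potential taxa")
        seen.add(taxon)
        path.append(taxon)
    return path


path = resolve(("Aus bus", "Old Flora 1960"), RELATIONS)
current = path[-1]
```

Applied to every name in an old flora, a lookup of this kind would yield a checklist in which each entry carries both the original usage and the view it now resolves to.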
Paul Morris:
Let me just underline your point about the sociology and the
phrase "authority files." A couple of years ago I was compiling a
data model for invertebrate paleontology collections and I had to be
exceedingly careful in the language I used in talking to collections
managers -- that this was not a standard being imposed upon the
community. The systematics community as a whole and surprisingly the
museum community as well are very resistant to the concept of standards
being imposed on them and I think we need to be careful about what
language we take to them in talking about powerful tools that people
can use, powerful resources that people can use.
Roy McDiarmid:
Just some guidance from you. What this almost demands is moving
from a comfortable published hard-copy chresonymy, or taxonomy, or
whatever, and moving to an electronic format. What sorts of issues or
arguments do you have in your arsenal, so to speak, to counter the
traditional standard viewpoints of: there's no way that anyone will
pay for this in terms of publication. [WB: Publication of written or
publication of electronic...] Well I think that the argument is that
it's going to have to go to an electronic format. Currently people
would argue that hardcopy published format is in fact the acceptable
way that most taxonomists operate. You have to shift that whole
perspective to an electronic view, and I just wonder what arguments
do you find successful in convincing colleagues that that is in fact
the direction to move.
Walter Berendsohn:
Well, in fact, from personal experience I've noted that logical
arguments usually don't work. The most far-reaching success I've seen
was with a demonstration of the Flora Iberica on CD-ROM, which is of
course a printed publication that was... well it was converted to
something like a database (it was by no means what I would consider a
descriptive structured database) -- but it was kind of an eye-opener
for many traditional-minded taxonomists in the audience. It convinced
them that there is something really good about electronic
publication, for example the multiple search and retrieval options,
ease of correction, etc. But still, one issue that has not been
solved is the question of credit. If I give a presentation here, and
it gets published on the web, it still doesn't get me much credit, as
opposed to a printed article in a journal. The moment this situation
changes, things will change in the electronic publishing arena as
well.