Discussion at the Taxonomic Authority Files Workshop, Washington, DC, June 22-23, 1998

Transcript of Questions for Walter Berendsohn

Gary Rosenberg:
In effect, does this mean that any citation in the literature is a potential taxon, the way you're handling it?
 
Walter Berendsohn:
Yes. Well, theoretically yes, but practically, no. We need to exercise judgement, but the model allows it. In fact, any name in the sense of any reference can be entered as a potential taxon. If a name has important information linked to it, you can enter it into the system as a new potential taxon waiting for someone to decide whether it represents a new taxon (or circumscription), or whether it's the same as an existing potential taxon. One of the advantages of this model is that you can actually store all of this information until somebody can look at it, and that the original information can be retained beyond that for future scrutiny. So the potential taxon concept has implications for information management as well as information retrieval. Of course, there's a big danger of inflating the number of potential taxa, and, as I said, the person indexing the literature has to be aware of this. On the other hand, an application could filter potential taxa for users who are not taxonomists, and even for some taxonomists who don't want to see all the applications that a name has had. I think the decisive thing is that as soon as there is important information linked to a name, the use of the name has to be recorded somehow, and then it has to be a potential taxon.
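
As a rough illustration of the idea described in this answer, a potential taxon can be sketched as a name paired with the reference in whose sense it is used, plus a flag recording whether anyone has reviewed it yet. This is only a minimal sketch: the class, field, and function names (PotentialTaxon, reviewed, visible_to_general_users) are hypothetical and are not taken from the published model, and "Aus bus", "Reference X", and "Reference Y" are placeholders.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PotentialTaxon:
    """A name used in the sense of one reference (hypothetical structure)."""
    name: str               # scientific name as cited
    reference: str          # the work in whose sense the name is used
    reviewed: bool = False  # False: still waiting for a taxonomist's decision

    def label(self) -> str:
        return f"{self.name} sensu {self.reference}"


def visible_to_general_users(taxa: List[PotentialTaxon]) -> List[PotentialTaxon]:
    """Filter out unreviewed usages for users who are not taxonomists."""
    return [t for t in taxa if t.reviewed]


# The same name entered twice, once per reference, gives two potential taxa.
taxa = [
    PotentialTaxon("Aus bus", "Reference X", reviewed=True),
    PotentialTaxon("Aus bus", "Reference Y"),
]

for taxon in visible_to_general_users(taxa):
    print(taxon.label())
```

The unreviewed entry stays in storage for later scrutiny; it is only hidden from the filtered view.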
 
Frank Bisby:
Walter, as you know, I fully endorse the model and your usage of it, with the potential taxa. What I do want to put to you, taking the example you gave with the mosses, is that I believe you may underestimate the sense in which potential taxon linking is already happening in many of the taxonomic databases. You may not have realized this, but in the example you gave for mosses, all of the names in your concept synonyms had different author strings. Consequently, for instance in the ILDIS system, or FishBase, or in many of the taxonomic databases, exactly those same links would have been made, and the only difference is that you've got an annotation. Again, we often have those annotations in notes. So while I fully endorse it, and what you're talking about is the precise, correct way of doing it, I think that many of the well-organized databases that have full synonymic indexing do in fact have provision, or are part way towards your taxon concept already. It could be added just by annotating the author string. If you put the same name in twice, and put it with sensu "X" and sensu "Y", you would be there already. So I don't think we're so far away from it as you imagine.
 
Walter Berendsohn:
I agree, absolutely. The development of this model parallels what has gone on in the biological collections community over the past decade. If you model something that is heavily regimented, by de facto international business rules or even by international codes, like nomenclature and taxonomy, you will end up with similar results. The potential taxon model is just a synthesis, trying to provide access to names, concepts, misapplications, multiple classifications, and multiple hierarchies at the same time. And you are right that the example wasn't well taken because I didn't include any identical names; of course you can have identical names. If I had expanded the discussion a little more I could have shown that the model supports misapplications in the name itself, because there were quite a few author citations in the example I gave. [Frank Bisby off mic: "But again those are in annotations in the author strings."] You're absolutely right, taxonomists have been doing this for a long, long time, so we would hope that completely annotated records would contain this information. The potential taxon concept just provides an effective way to store information that was previously unstructured.
 
Stan Blum:
The point that I would like to add is that one of the critical implications of this is that people who are doing identifications need a mechanism to link specimens and observations to not just the name, but to the concept. That's not something we're doing in natural history museums or probably anywhere else in the field.
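
One way to picture the distinction drawn here is an identification record that stores the concept (name plus reference) alongside the bare name. The sketch below is only an assumption about how such a link might look; Concept, Identification, and is_concept_linked are hypothetical names, and the specimen numbers and taxon names are placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Concept:
    """A name in the sense of a particular reference (hypothetical structure)."""
    name: str
    reference: str


@dataclass(frozen=True)
class Identification:
    specimen_id: str
    name: str                          # what most collection records store today
    concept: Optional[Concept] = None  # the extra link to a concept, if recorded


def is_concept_linked(ident: Identification) -> bool:
    """True when the identification points at a concept, not just a name."""
    return ident.concept is not None


name_only = Identification("SPEC-0001", "Aus bus")
with_concept = Identification(
    "SPEC-0002", "Aus bus", Concept("Aus bus", "Reference X")
)

for ident in (name_only, with_concept):
    print(ident.specimen_id, is_concept_linked(ident))
```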
 
Stuart Nelson:
I wonder if you would talk a little bit about the definitional attributes, about how it is that you're going to represent the differences in these... I think the technical word for these things that are almost the same is plesionyms. I wonder how you're going to represent that difference, so you can represent things in multiple hierarchies.
 
Walter Berendsohn:
Well, we could go on quite a while about the model; it is published. I think you might be referring to two things now; the relationship between names that refer to the same thing, and multiple hierarchies, which taxonomists might think of as competing or alternative classifications. I didn't even mention hierarchies here because that would have extended my talk quite a bit, but if you are talking about specifying the nature of the relationship between names or potential taxa, the attributes you need are specified in the model.
 
Stuart Nelson [off mic]:
I think you would want to represent on what basis the differentiation is made. Can you specify the definitional attributes of one taxon versus another?
 
Walter Berendsohn:
These defining characteristics are, in fact, what the taxonomist includes in his or her taxonomic description or revision. Unfortunately, we haven't yet arrived at a universally accepted way to structure that data and to capture it in databases. For now, we have to be pragmatic. We need to keep the scope of work practical, so we can't record every slight difference in taxonomic treatments; no two treatments of a taxon will be equal. So we have to have a cutoff point, and that's the decision of the taxonomist who is trying to record and represent the consensus view we are talking about. In other cases it may not be possible to arrive at a single consensus view and multiple classifications will need to be represented. The model can accommodate several parallel accepted taxonomies without any problem.
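
To make the last point concrete, several accepted classifications can sit side by side if each one is stored as its own set of child-to-parent links. The following is a minimal sketch under that assumption, not the structure of the published model; the classifications dictionary, the lineage function, and all taxon and treatment names are hypothetical placeholders.

```python
from typing import Dict, List

# Each classification keeps its own child -> parent links, so two treatments
# can place the same genus in different families without conflict.
classifications: Dict[str, Dict[str, str]] = {
    "Treatment A": {"Aus bus": "Aus", "Aus": "Family One"},
    "Treatment B": {"Aus bus": "Aus", "Aus": "Family Two"},
}


def lineage(taxon: str, classification: str) -> List[str]:
    """Walk upward from a taxon to the top of one chosen classification."""
    parents = classifications[classification]
    path = [taxon]
    # Stop at the root, or if a malformed hierarchy loops back on itself.
    while path[-1] in parents and parents[path[-1]] not in path:
        path.append(parents[path[-1]])
    return path


for treatment in classifications:
    print(treatment, " > ".join(lineage("Aus bus", treatment)))
```

Keeping the hierarchies separate means neither treatment has to be declared the single correct one in order to be stored.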
 
Nancy Morin:
Walter, I think in most or a lot of efforts where they're computerizing collections information, it's only the most recent identification that's being included, and what you're saying suggests that there is value in capturing the annotation history on a specimen. Because that's how you're going to be able to link from one concept to another: that this specimen was annotated as species "X" by one person and species "Y" by another person.
 
Walter Berendsohn:
Well, all the more extensive models of collection systems I know have identification histories in them...
 
Nancy Morin:
Right, but I think a lot of us take short cuts and don't do that because it seems like an added time effort.
 
Walter Berendsohn:
Well, this is already a rather specialized thing. If we had all the information, it would be useful, but for now I would rather emphasize the more practical aspects. For example, if you were to use a determination key in one of the floras in Central America, and you came to a species name, you might not know how it relates to current taxonomy. If you have the capability to look up the name in the sense of Flora Guatemala, and find that somebody has already cleared that up -- and found that the name has been applied to something completely different, or has been replaced by another name, or is a synonym, or whatever -- you could retrieve that information from a system like this. This is the important part, to make use of all the knowledge, paper-based or even computerized identification keys, that are out there. This was one of my approaches to doing floristic work. If you're working with a thirty-year-old flora, using potential taxa allows you to get a checklist that captures changes in taxonomy and leads you precisely to the currently accepted view.
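
The lookup described in this answer can be sketched as following recorded links from a name in the sense of an older work to the currently accepted potential taxon. This is only an illustration under stated assumptions: the links table, the resolve function, the relation wording, and every name and reference below are hypothetical placeholders, not data from any real flora.

```python
from typing import Dict, Set, Tuple

Key = Tuple[str, str]  # (name, reference in whose sense it is used)

# Each entry records how an older usage relates to another potential taxon.
links: Dict[Key, Tuple[str, Key]] = {
    ("Aus bus", "Old Flora"): ("misapplied, now", ("Aus cus", "Current Checklist")),
    ("Aus dus", "Old Flora"): ("synonym of", ("Aus eus", "Current Checklist")),
}


def resolve(name: str, reference: str) -> Key:
    """Follow links until reaching a potential taxon with no further link."""
    key: Key = (name, reference)
    seen: Set[Key] = set()
    while key in links and key not in seen:  # 'seen' guards against cycles
        seen.add(key)
        _relation, key = links[key]
    return key


# Build a checklist from names keyed out in the old flora.
for old_name in ("Aus bus", "Aus dus"):
    accepted_name, accepted_ref = resolve(old_name, "Old Flora")
    print(f"{old_name} sensu Old Flora -> {accepted_name} sensu {accepted_ref}")
```

A usage with no recorded link resolves to itself, which here simply means nobody has cleared it up yet.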
 
Paul Morris:
Let me just underline your point about the sociology and the phrase "authority files." A couple of years ago I was compiling a data model for invertebrate paleontology collections and I had to be exceedingly careful in the language I used in talking to collections managers -- that this was not a standard being imposed upon the community. The systematics community as a whole and surprisingly the museum community as well are very resistant to the concept of standards being imposed on them, and I think we need to be careful about what language we take to them in talking about powerful tools that people can use, powerful resources that people can use.
 
Roy McDiarmid:
Just some guidance from you. What this almost demands is moving from a comfortable published hard-copy chresonymy, or taxonomy, or whatever, and moving to an electronic format. What sorts of issues or arguments do you have in your arsenal, so to speak, to counter the traditional standard viewpoints of: there's no way that anyone will pay for this in terms of publication. [WB: Publication of written or publication of electronic...] Well I think that the argument is that it's going to have to go to an electronic format. Currently people would argue that hardcopy published format is in fact the acceptable way that most taxonomists operate. You have to shift that whole perspective to an electronic view, and I just wonder what arguments do you find successful in convincing colleagues that that is in fact the direction to move.
 
Walter Berendsohn:
Well, in fact, from personal experience I've noted that logical arguments usually don't work. The most far-reaching success I've seen was with a demonstration of the Flora Iberica on CD-ROM, which is of course a printed publication that was... well it was converted to something like a database (it was by no means what I would consider a descriptive structured database) -- but it was kind of an eye-opener for many traditional-minded taxonomists in the audience. It convinced them that there is something really good about electronic publication, for example the multiple search and retrieval options, ease of correction, etc. But still, one issue that has not been solved is the question of credit. If I give a presentation here, and it gets published on the web, it still doesn't get me much credit, as opposed to a printed article in a journal. The moment this situation changes, things will change in the electronic publishing arena as well.