Examples of Taxonomic Authority Compilation Projects
Anon. (F.):
Some of the data included distribution, and I saw some names up there like Zimbabwe and
Papua New Guinea. What do people think about updating geographic names, and how is that
going to be done?
Stan Blum:
That may be a little ahead of the agenda. Linda Hill will be speaking tomorrow about
gazetteers, in the sense of authority files for place names and spatial objects that get
associated with place names (i.e., digital footprints). So ... [gap due to tape change]
Nancy Morin:
[...] In taxonomy, I think that's actually a very complex question for us; and the
complexity isn't so much should we just change the name because the name of the country
has been changed. We are very historically connected and so it's important for us to know
what the original name on the label was, that tells where the plant or organism was
collected. Those geographical boundaries are not always as easily moved back and forth, so
that you can still get an idea of what the distribution of the organism is. And so it's more
of a metadata and philosophical problem than just a data management problem.
Frank Bisby:
I think a key element in that discussion is this question of raw data versus synthesized
data. So when we're discussing geography, as with taxonomy, there's the question of the
original names, the original specimens, the original publications. In the case of
geography there's the question of the original records that the particular organism was at
a particular place at a particular time. That's very different from the synthesized data
or knowledge, which is created by people (not observed) and lists species present
in countries, as in a database, such as the one that I showed.
Bruce Collette:
We've discussed a lot of the problems in building databases, or putting together these
databases. Bill Eschmeyer says it cost a million dollars for fishes. I don't know what
we've spent so far in ITIS, but there's another question we have to deal with before we're
through this workshop and that is "Who's going to maintain these?" The Bill
Eschmeyers of the world are not numerous, and once it's done to that level it's not as
big a job to maintain it, but somebody has to maintain it. Frank says the professional
organizations are a possibility. Well, the professional organizations, at least the
ones I'm familiar with, don't have the funds to hire people to add the names, make the
changes, and things of this nature. So even if we can build these, we have a question of
sustainability, we have a question of funding. As Frank has alluded to, somebody is going
to have to pay to hire people to maintain them. So as we go forward, let's not just think
of building them, but of the next problem of maintaining them.
Gary Rosenberg:
I think that that's getting to the issue that we really need to change the way
biologists do business. In other scientific disciplines, like physics or chemistry, new
theories accommodate old ones and get built into their way of doing business. But with us,
species are our hypotheses, our theories, and we don't have any way of synthesizing except
through projects like these. If we change the way people view these projects, rather than
their being secondary compilations that aren't true research efforts, if they become
really part of what biologists and systematists are supposed to do, then we can find
better funding for them. But it will really take a paradigm shift before people realize
the value of these kinds of databases as integral to scientific research.
Bill Eschmeyer:
One problem I see is that my generation has those three-by-five cards, and a lot of
information is stored there. What I would really envision would be getting the funds for
an organization where technicians can be assigned to these workers around the world, and
pick their brain, and enter the information from their cards. ZooRecord is a big help, but
it started a lot later than this system [taxonomy], and we need a world-level
organization, like the UN or World Bank, to put up the money to let an organization
station technicians with the specialist and gather that information before it's lost. Some
kind of approach like that would be best. Or put together a team that would have some
library people, programmers, a linguist, a real team; it would be big money, 15-20
million dollars, but that's not real big money to some of these organizations. That's
the kind of approach that I would envision, if we really want to do this right. ... and a
geographer, and a few others, some data entry people, some typists.
Jim Beach:
So far this morning we've heard about three different kinds of projects, at least in my
classification. One is the end-user projects out there that are trying to meet a
particular need, in some cases on a global basis. We've had a description of one
technology-driven project, where the Plant Name Index is going to be on the leading edge
for a while and try to push the envelope and see what kind of applications can be layered
on top of that. But there's a third area that hasn't been addressed in terms of
professional activities, and that is standards development, per se. I think
that's a big difference between our two communities: that we do not have the
professional infrastructure at nearly the same scale in the museum world, at least the
natural history museum world, that the libraries have. I'd like to hear more about
actual standards development and opportunities for that and for maintaining the standards.
For example it's quite clear that we have a lot of projects working with species names, but
we have no formal ... I better be careful here ... I don't think there are any widely
accepted standards for description of species names, or even of specimens in museums.
There have been several runs at that problem, but for various reasons they haven't been
maintained, I think largely due to funding. There hasn't been global or national
maintenance of standards in the museum community. I just think that's a terrific
problem, one that we'll solve, but one that we've got to address directly if we're
going to leverage species names projects and other kinds of databases off of each other.
Chris Thompson:
I can't remember who mentioned the word tension. Among the systematists there are a
number of tasks that the diminishing few have to do. One of them is to manage species,
which means manage their names. Now we've just been talking about a third, and I can talk
from personal experience about maintaining one of the standards, which is called the
"Red Book," the International Code of Zoological Nomenclature. I will tell you, the
bottom priority of all the zoologists is
maintaining standards like that. Essentially what drives them is peer recognition, which
comes from the discovery of new species, and the development of new classifications.
Secondarily it's the species names and the nomenclature, and the last is the standards.
They just don't have time for them.
Larry Speers:
We've talked about funding and I think we're coming into a real cultural change in the
biological community. Over the last few years I've been involved with the OECD Megascience
Forum [for example, see: OECD Activities Related to Biological Diversity, Global
Biodiversity Information Facility (GBIF), and Implementation of GBIF], and one of my
colleagues phoned up Canada and said: "Why hasn't Canada responded as an OECD country
to participate in this megascience forum?" From
Industry Canada the response was: "Biology is not a megascience." That's the
vision that many of the bureaucrats have. Chris Thompson and I have been playing around
with an idea, although I don't have numbers for it, I believe that if you plot the size of
things studied, from sub-atomic particles to the universe, against the funding in
science, it's a U-shaped curve, with high-energy particle accelerators and Hubble
telescopes being at the ends of the spectrum. I believe that somebody quoted that the last
subatomic particle cost 50 billion dollars to discover. Robert May said that there was
more spent on the repair of the Hubble telescope than was spent on biodiversity research.
If you plot on the same curve the impact on human health and wealth, it's an inverse
funding curve, unless you can hypothesize an asteroid hit or cold fusion. We've got to
start thinking in the kinds of megascience that can turn biology and allow us to look at
those things. Bill mentioned 20 million dollars. In the high-energy physics community,
that isn't a large amount of money. To start getting the standards to communicate, to meet
our needs in biodiversity information, we have to start looking at larger pots, larger
funds to do that. Can biodiversity become our Manhattan Project? Can it become our
Moon-Shot?
Adam Schiff:
I just wanted to comment on standards from the library side. We seem to make it a big
priority. In fact, one of the most prestigious things that you can do professionally is to
serve on the standards committees that are involved in writing the rules and maintaining
the rules. We've made that a very high priority, professionally.
Frank Bisby:
I'd like to chip in there and support Jim Beach very strongly in saying that what we
need is more work on standards. I think Jim and I may have a very slightly different
perspective in that he and I first met at an organization called TDWG, the Taxonomic
Databases Working Group, which was started in 1985, at the Conservatoire et Jardin
botaniques in Geneva, and has attempted continuously from then 'til now to trigger
operations that would lead to standards. A few of them do exist. A few of them
have been mentioned in this meeting already, such as the authors list compiled by Dick
Brummitt and Powell, at Kew. That was one of the standards that came out of those
meetings. There are others describing names... The geography system used by many of the
botanical databases around the world is a TDWG standard, and so on. But having said all
that, I thoroughly agree with Jim that what's needed is a much greater resource and
investment in those. When Jim Zarucchi at Missouri Botanical Garden was secretary of that
organization and I was chairman at the University of Southampton in England, we made
attempts to get grants to sponsor some of those standards, and not only was it a complete
failure to fund them, but we couldn't even get the major institutions at that time to back
the grant application, because in each case they would have been in competition with
grants they wanted for more immediate purposes, as it were. So I agree with Jim, and I
actually think that our time has come, if you like. TDWG was actually ten years ahead of
itself, and I think that the mood has changed, and that we should now pursue that very
thoroughly.
Walter Berendsohn:
I think I agree with that. The mood has changed. Although perhaps Canada has not
made the move yet, most of the other OECD countries have at least given signals that
biology is recognized as megascience, and even biological informatics is megascience for
them. Actually, the action taken by the megascience forum working group on biological
informatics is a complete turn-around in some ways. I remember in one of the first
sessions I said that we need infrastructure, that we need a body to further developments
like standardization, like developing these types of organizational standards that will
make people work together, and this was kind of... well, it wasn't booed down, but
something similar to that. "It mustn't cost any money. Member countries won't agree
to it, if it really costs money." And this has gone completely around to a proposal
that actually sets up a secretariat to organize these things. So I think there is really
new consciousness of the problems in biological informatics and biological information,
and the importance of organizing that. I hope this will filter through to individual
governments. I think it's a positive sign that, of the few working groups in the
megascience forum, the biodiversity working sub-group was selected to be presented to the
ministerial meeting next year.
Joan Swanekamp:
Going back to the libraries' acceptance of standards: we've had standards probably for
the last hundred years, but I think it's been the last thirty years that the economics
have really driven those standards: the fact that we were doing identical work in one
library after another, after another, and the recognition that if we could share that
work in a much more effective way we could really make some headway. The fact that we
needed standards to share that work, MARC standards for our machine conversion, and the fact
that thousands of libraries are sharing this data, is what has driven us to the successes
that we've had: the need for that consistency, a community that agrees on where those
standards need to be adjusted. Our cataloging operations are still very, very expensive,
but if it wasn't for the fact that we're sharing as much work as we are, we'd be many,
many years behind.
Stan Blum:
I think we need to wrap up this discussion, but from what I've heard, at least in the
latter part of this period, it appears that a key recommendation might be emerging, which
is that the systematics community, in particular its standards organizations, should be
talking to the organizations that create and maintain library standards, at least to
learn the mechanics and social processes that they're using. So this might be a
recommendation for further work, or a follow-on workshop: to deal with this issue of
standards processes; perhaps hold a training, where experts in library standards processes
could talk to members of the systematics community.