Transcript of Questions and Answers for Jessica Milstead
In the relations between entities, you had the hierarchical relationship and the
synonymy. How do you deal with a term that either might point to two senior terms,
or that might have a correct position and a widely used wrong position?
Say an intergeneric hybrid that might point to two positions, to take your first
case: is that what you mean?
Yes, or an older genus that had been split into two.
I think my Hapalemur example was probably an example of that. A taxonomist might
correct me, but I believe that Hapalemur was split out of Lemur, and Lemur remains
valid. So these two now lead up to the broader family. There could be a note
with each explaining when it was valid, what it was valid for, what was done. You
can put anything you like in a note. As to incorrect usages, my inclination is to
say that you'd probably treat them as synonymy, perhaps with a special type of synonymy
relationship. I'm trying not to be a taxonomist and limiting myself to what I've
picked up in my work. The point is that you've got possibilities with the structure
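
The options mentioned in this answer (a term with more than one broader term, a free-text note on each term, and a special type of synonymy for incorrect usages) can be sketched as a small data structure. This is only an illustrative sketch: the class names, the "misapplied" relationship type, and the placeholder taxa ("Genus A", "Family F", and so on) are assumptions made for the example, not anything described by the speaker.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """A preferred term; 'broader' may hold more than one parent."""
    name: str
    broader: list = field(default_factory=list)
    scope_note: str = ""


@dataclass
class CrossReference:
    """Points a non-preferred name at a preferred term, with a typed relationship."""
    variant: str
    preferred: str
    kind: str  # hypothetical types: "synonym", "misapplied"


# Placeholder names only; no real taxonomy is implied.
terms = {
    "Genus A": Term("Genus A", broader=["Family F"]),
    "Genus B": Term("Genus B", broader=["Family F"],
                    scope_note="Split out of Genus A; the note would say when and why."),
    "Hybrid AxB": Term("Hybrid AxB", broader=["Genus A", "Genus B"],
                       scope_note="Intergeneric hybrid, so it has two broader terms."),
}

refs = [CrossReference("Old name", "Genus A", kind="misapplied")]


def resolve(name):
    """Return (preferred term, relationship type) for any name a user might enter."""
    for ref in refs:
        if ref.variant == name:
            return ref.preferred, ref.kind
    return name, "preferred"
```

Here `resolve("Old name")` leads to "Genus A" while still recording that the usage was a misapplication rather than an ordinary synonym.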
This may be a question related to Frank's: in taxonomy and systematics, we deal
with multiple classifications, literally tens of thousands of classifications, and there
are hierarchies that conflict with one another and share nodes with one another. Has
your community thought about how to structure thesauri for that kind of a problem, where
you might have multiple conflicting or overlapping classifications?
Most thesauri are built for a database or database family. We've got to pay
attention to our users, but we can say this is how we're going to do it for our particular
system today, and we make notes when there are conflicting usages. Usually a good
thesaurus will have a lot of scope notes because there will be terms that are used in
fuzzy ways. For instance, if I were building an information science thesaurus, I
might want to have the term thesaurus in it, and I would give it a scope note saying I
don't mean a Roget-type thesaurus. So we have a general and very powerful, flexible
structure, but we haven't had to deal with the particular problems you deal with in
Do the efforts with multilingual thesauri provide anything to help in answering Jim's
question about multiple conflicting and overlapping classifications?
I don't think so. Multiple language thesauri, for instance the Bibliography of the
History of Art (its thesaurus is in French and English), are a whole area that I didn't
even touch. It would have taken me another ten minutes to give you a bare outline of
what happens with multi-lingual conceptual thesauri. Neither single nor
multi-lingual thesauri have confronted these conflicting classification problems at the
level that you have to deal with them. That's a problem that really is special to
taxonomy, I think.
I think there might also be another problem: that a lot of systematists are going
toward unranked classifications. First of all, the whole hierarchy has more ranks
than we have names for, and the other is that many people don't believe the ranks have any
meaning (or at least the names for the ranks don't), so what one person calls an order,
another person calls a suborder. But there's no use arguing about it if they both
agree about what's contained. So I'm wondering, is it possible to represent things
like this? You mentioned that you need the heading if you're going to have something
that's floating uncertainly, in order to specify that this is a class, and this a family,
but we're not certain about the intermediates.
Are you saying that hierarchy itself is being dropped, or just the names for the levels
of the hierarchy?
Just the names for the levels of the hierarchy are being dropped in some cases.
I could design a scheme that left out those names, not necessarily very easily, but it
could be done. For instance, I built a hierarchical display there [reference to
slide; see appendix of paper]. Something else that I could do is build a classification that used numeric
codes for different levels, say separated by dots or something, that just showed
hierarchy. I would think you need hierarchy for organization.
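
The dotted numeric codes suggested in this answer can be sketched briefly. The sketch assumes that each code is a sequence of integers separated by dots and that nesting alone carries the hierarchy, with no rank names stored anywhere. The code numbers, function names, and the choice of taxa are illustrative assumptions only.

```python
# Illustrative codes: the position in the hierarchy is carried entirely by
# the dotted number, and no rank name (order, family, genus) is recorded.
codes = {
    "1": "Primates",
    "1.1": "Lemuridae",
    "1.1.1": "Lemur",
    "1.1.2": "Hapalemur",
}


def depth(code):
    """Number of levels in a code, e.g. '1.1.2' has depth 3."""
    return len(code.split("."))


def parent(code):
    """Code of the immediately broader entry, or None for a top-level code."""
    head, sep, _ = code.rpartition(".")
    return head if sep else None


def descendants(code, table):
    """All codes nested anywhere below the given code."""
    prefix = code + "."
    return sorted(c for c in table if c.startswith(prefix))


def display(table):
    """Indented listing, ordered numerically by each part of the code."""
    ordered = sorted(table, key=lambda c: [int(p) for p in c.split(".")])
    return ["  " * (depth(c) - 1) + table[c] for c in ordered]
```

Calling `display(codes)` gives an indented listing in which Lemur and Hapalemur sit under the same broader entry, and a new intermediate level could be added later by assigning new codes rather than by inventing a rank name.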
We've been talking about taxonomic classifications, but I think it's worth mentioning
that another way that we're interested in thesauri is for multiple entry keys and for
descriptive information. If we've got descriptions in a structured database format,
there are a lot of less inclusive and more inclusive terms that we need to relate back and
forth so that people can get to the information they need. They might ask for hairy,
or pubescent, or three other things that may be either more or less inclusive.
In the terminology I've been using you could call that a conceptual thesaurus of the
characteristics of organisms that are used to define them, and I certainly think that a
more or less controlled language would be good, provided people would use it. We
have "literary warrant", as we call it in library and information science,
because if people aren't using the term, then our decreeing that this is what's correct
isn't going to help.
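
As a rough illustration of the conceptual thesaurus of characteristics discussed here, the sketch below expands a search term to its preferred form plus every less inclusive term, so that a request for a more inclusive term also retrieves descriptions indexed under narrower ones. The vocabulary, the relationships between the terms, and the function name are assumptions chosen for the example and are not an agreed descriptive standard.

```python
# Hypothetical descriptor vocabulary: each preferred term lists its
# less inclusive (narrower) terms.
narrower = {
    "hairy": ["pubescent", "hirsute"],
    "pubescent": [],
    "hirsute": [],
}

# Non-preferred entry terms pointing at the preferred term.
use_for = {"downy": "pubescent"}


def expand(term):
    """Return the preferred term followed by all of its narrower terms."""
    preferred = use_for.get(term, term)
    found = []
    stack = [preferred]
    while stack:
        current = stack.pop()
        if current not in found:
            found.append(current)
            # Reverse so that narrower terms are visited in listed order.
            stack.extend(reversed(narrower.get(current, [])))
    return found
```

A search on "hairy" would then also match records described as pubescent or hirsute, while a search on the non-preferred "downy" is redirected to "pubescent".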
Would you say there's any truth to the statement that the person who constructs the
thesaurus determines what the hierarchy is, and until someone constructs another hierarchy
there's no problem about conflicting hierarchies?
Hmmm. As a person who has done some thesaurus construction, I can only say it's
not that simple in a thesaurus, and I suspect that where a systematic taxonomy is
concerned, I'd be walking into a landmine if I said yes.
I'm going to take the last opportunity for a question. What Linda [Hill] was
describing was, in a sense, a framework for distributed contribution to a single
compilation of geographic names. Would it be possible for us to construct something
that would enable distributed contribution to a single repository for taxonomy?
Yes, if there is a person or group of people who can keep the structure working.
This [thesaurus] is a structure, not just a list of terms. There actually is a very
good example out there; there are other examples, but the best known and oldest one is
the ERIC (Educational Resources Information Center) thesaurus. It has a
central thesaurus coordinator, and a committee, but the terms come out of the various ERIC
clearinghouses. There are in the neighborhood of 20 of them, each in a subject
specialty. They are the ones who do the indexing for the big ERIC database, and they
submit terms, which, via a committee and more general oversight, are integrated into the
thesaurus. I don't think you can make a thesaurus work, in fact I'm dubious that you
could make any kind of information structure work, without some form of centralized
oversight (if you just have people throwing stuff in), because you've talked about the
varying levels of skill and ability. I think that's a safe statement to make
generally, but certainly anytime there's an overall structure, somebody has to keep that