Discussion at the Taxonomic Authority Files Workshop, Washington, DC, June 22-23, 1998

Transcript of Questions for Joan Swanekamp


Bill Eschmeyer:
Taxonomic literature doesn't go out of date, unlike some other disciplines in biology. Is there any effort at retrospective cataloging—getting the works of the late 1800s and early 1900s up to the standards that you require of new materials?
 
Joan Swanekamp:
Yeah, there is. In cataloging, there are really two pieces: there's that descriptive part, that describes what you have in hand, then there's the heading portion of it, the names, the corporate bodies that we refer to when we're talking about authority control, and the subject analysis part of it, which is another aspect of authority control. What most of us are doing, we're converting our catalogs that go back a long way. At Yale we're just embarking on a project to convert 4.5 million titles that have not been converted yet, over the next 4-8 years (depending on who you talk to). What we're doing is putting the effort into bringing the subject terms up to date, and the headings: the names, the corporate bodies that are associated with those. We feel, and most of our peers feel the same way, that we can live with the descriptive elements even though they may not have been recorded exactly the same way as we would now; the information is still basically the same. But we are trying to bring those headings up to date. The way that we're doing it at my institution, because we have a catalog of about 10 million titles, and obviously we don't have the staff to review every single one of them, we've contracted with a vendor to run our whole file through these processes, make as many of the automated conversions as they can, and identify for us as many as possible of the problems that they can't fix but know are there, and we've got staff that are taking a look at those. We're going to be at this for a very long time. We're spending big bucks to do it, but we know that our on-line catalog is going to be useless unless we can bring as much as possible into line with current practice. And we're bringing 300 years of practices into line.
 
Jim Beach:
Joan, I'd like to ask you and Beacher perhaps a question that's perhaps not entirely related to your talk, but… There are a lot of things going on now with cataloging involving the Dublin Core metadata fields; this seems to be an area of potential relevance to the museum community because it attempts to bridge and maybe generalize your cataloging principles for other kinds of objects. Could you talk about how you view the Dublin Core in terms of its functions and how it might replace in part some of the full MARC and Cooperative Cataloging architectures you have now?
 
Joan Swanekamp:
OK, I can start. I mean we're probably talking about a couple of hours here. At Yale, we're trying in fact to coordinate some big image projects that are happening in some of our museums, our art library, where there's an interest in using not just the Dublin Core, but other metadata standards that in some cases are more analogous to cataloging codes, but diverge in a number of ways. What we've had up until now are numerous pilot projects that have been quite successful themselves, but it's that integration part that we're wrestling with right now. There are a number of people that don't see why we can't just import Dublin Core metadata records into our catalog and use those as catalog records. The view that we have had up until now – I would say that we have a developing view and it will continue to develop, so three weeks from now this could be different – but we've used that metadata as the basis of that bibliographic description, but we've enhanced it. Where we've had metadata for the most part up until now that's formulated according to Dublin Core is with some of our scientific collections and publications that are actually being created at Yale that we're trying to provide bibliographic access to. We're tending to view that, at the moment, as more like title page information, that we can then take and expand. In our catalog, at least right now, they don't mesh very well. I think there are some catalogs where they can. The big question we're wrestling with is should we try to store that outside our catalog and connect to it, or should we try to integrate it into our catalog, and we've not been able to answer that question. Does Beacher have a view on that?
 
Beacher Wiggins:
It's pretty much along the lines of what Joan was saying. All of this is so evolving for us. We have several experiments going on at the Library of Congress, for instance, trying to see how we can use the Dublin Core, and more importantly, how we can map it to the MARC record, because when we stop and think about the 20-plus million bibliographic records we have now that are MARC records, we want to be able to have those records integrate easily. I think what's going to help the community at large is getting more powerful systems. We at the Library of Congress, for instance, are just now working to install Endeavor's Voyager. That's going to offer us ways of accepting records that we can't now and still have a linked or integrated connection among the records and the data in our databases. That's part of what I was getting at when I was saying we need to have more staff engaged in this across the community so we can see how better to take advantage of it. Right now we are using it for more specialized items in collections rather than for the traditional print data that we get, which still remains the bulk of materials that we are responsible for providing access to. I think it's fair to say that none of us in the library community have shut the door on the Dublin Core; in fact we are heavily engaged in it through the Network Development and MARC Standards Office there at the Library. So we definitely have a vested interest in seeing and taking on any approach that's going to make our approach faster, more efficient, and still provide the access. The jury's just still out and we're constantly trying to incorporate what comes down the pike.
 
Joan Swanekamp:
If I could add just one other piece. Before I came to Yale, when I was at Columbia University, what we were doing was developing a relational database that was taking metadata from a number of sources and importing it into one database that we were then using to provide access to a whole range of digital objects. They're continuing along those lines, but they're running into, or identifying, real problems with maintenance in that database with all this information coming from so many different sources: there's this lack of consistency or authority control that we're talking about.
 
Nancy Morin:
The words "user-friendly" didn't come up a lot when we were talking about taxonomic authority files, and it's certainly true that taxonomists can build authority files that only another taxonomist can use. It also seems like it would be possible for a cataloger to build a system that only another cataloger would be able to use. I know the digital library community is looking a lot at how any individual goes through and locates, or the thought processes by which they search through the material. Maybe we should be doing some focus group kinds of tests with the taxonomic authority file. Do you do that kind of testing, to see whether someone can actually find something once it's been cataloged?
 
Joan Swanekamp:
There have been library studies that have tried to interpret the way users use our catalogs and on-line systems, but we haven't been real good about translating that into how we catalog. I think there now is a recognition that we need to be cataloging for our users, especially because we're building systems now that don't require those users to come into the library. Our medical library says they see only 20% of the community that they serve. So we do need to do a much better job of not just providing different kinds of help, but customizing our records, and maybe that means working on our rules and our data structures that go along with those, too, to support what it is that our users need and the style with which they approach those records, and our collections.
 
Stan Blum:
You were talking about trust; that people have to trust other people's decisions in the way that they have created catalog records. Do you think that could be enhanced if there were some evidence of the level of review that a record has achieved, or are there such indicators already?
 
Joan Swanekamp:
In our authority record program, these records are only contributed by a limited number of institutions and they go through a very rigorous training process and review process before they're allowed to contribute those records without review. In the bibliographic portion, we've not had good ways of doing that, but our BIBCO program is in fact an effort to code records created according to program standards. They carry a little code in one MARC field that allows them, as one of my friends at RLIN says, to twinkle; that is, they stand out from other records. When you've got a choice, the idea is that you would choose one of those records.
 
Chris Thompson:
I'll be a heretic within my community, but I was very interested to hear you talk about change in cultures, because to me, near perfection, comprehensiveness, little accountability, sounds like me. Getting 100 percent of all the names over the last 240 years, it's that last 5 percent that takes the most time. I'm amazed that you've changed to a culture that talks about "better, faster, cheaper", so the user has access now. I wonder what our community thinks about not cataloging all those names, to get checklists done quickly.
 
Joan Swanekamp:
That's a loaded question, but I think what we're trying to do is find a balance. We're not saying that high quality, and that perfection, isn't important, but we had catalog departments where people did one book a day, and that clearly was not acceptable. We're really just trying to find a balance there. In fact we have that same problem, that if we have typographical errors in crucial pieces of information, if it's formulated wrong, it's lost.
 
Stan Blum:
At this point I should say that we're well into the general discussion period for this afternoon's session, so please feel free to ask questions of any of the other speakers that we've heard from this afternoon.
 
