Welcome to the news column. Its purpose is to disseminate information on any aspect of cataloging and classification that may be of interest to the cataloging community. This column is not just intended for news items, but serves to document discussions of interest as well as news concerning you, your research efforts, and your organization. Please send any pertinent materials, notes, minutes, or reports to: Sandy Roe; Memorial Library; Minnesota State University, Mankato; Mankato, MN 56001-8419 (email: sandra.roe@mankato.msus.edu; phone: 507-389-2155). News columns will typically be available at the CCQ website (http://catalogingandclassificationquarterly.com/) prior to their appearance in print.
We would appreciate receiving items having to do with:
Abstracts or reports of on-going or unpublished research
Bibliographies of materials available on specific subjects
Analysis or description of new technologies
Call for papers
Comments or opinions on the art of cataloging
Notes, minutes, or summaries of meetings, etc. of interest to catalogers
Publication announcements
Description of grants
Description of projects
Announcements of changes in personnel
Announcements of honors, offices, etc.
RESEARCH & OPINION
New Technology: SFX
The
Caltech Library System is currently a beta test site for an exciting new
Internet linking product named SFX.
Originating from research by Herbert Van de Sompel of Ghent University
in Belgium, SFX is a “context-sensitive reference linking solution” that allows
librarians to define local electronic collections and the way those collections
interact and are presented to users.
SFX is currently owned by Ex Libris, provider of the ALEPH integrated
library system and other applications.
SFX
extracts the metadata from a given bibliographic citation in an SFX-aware
resource and passes it through to an SFX server, where it is matched against a
set of pre-defined relationships to other resources, and returns a set of
extended services that would be available for that set of metadata. For example, a user performs a search in Web
of Science, locates a relevant article, and clicks on the SFX button next to
that citation. The user sees a new
window spawned by the browser that contains all the pre-defined relationships
for the metadata from that citation.
Depending on what the local collection allows, the user could select “Go
to the full-text of this article at the publisher’s website”, “Find local
holdings in the OPAC for this journal,” or “Find the impact factor of this
journal in the ISI Journal Citation Reports.”
Possibilities are only limited by the local collection and the
imagination of the librarians who define the services.
The
database that drives SFX includes tables that define Sources, which are
resources where a user would start a search; Source Services, such as
get full-text, find author, show local holdings, etc.; Targets, which are resources where
a user would want to go; Target Services, such as get full-text, find author,
etc.; “Colli”, which are conceptual links between Sources and Targets; and
Object Portfolios, which define the Sources, Targets, and Services with which an object
(usually a journal) is associated.
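
The relationships among these tables can be pictured with a minimal Python sketch. It is an illustration only, not SFX's actual schema: the table contents, the ISSN, and the names ObjectPortfolio and extended_services are all assumptions invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectPortfolio:
    """Targets and services associated with one object (usually a journal)."""
    issn: str
    # target name -> services that target offers for this object
    services_by_target: dict = field(default_factory=dict)


# Hypothetical table contents, invented for this example.
SOURCES = {"web_of_science"}
TARGETS = {"publisher_site", "opac", "pubmed"}
# Conceptual links between Sources and Targets.
LINKS = {("web_of_science", "publisher_site"), ("web_of_science", "opac")}

PORTFOLIOS = {
    "1234-5679": ObjectPortfolio(
        issn="1234-5679",
        services_by_target={
            "publisher_site": ["get full-text"],
            "opac": ["show local holdings"],
            "pubmed": ["find author"],
        },
    )
}


def extended_services(source, issn):
    """Return sorted (target, service) pairs offered for this object from this source."""
    if source not in SOURCES or issn not in PORTFOLIOS:
        return []
    portfolio = PORTFOLIOS[issn]
    return sorted(
        (target, service)
        for target, services in portfolio.services_by_target.items()
        if target in TARGETS and (source, target) in LINKS
        for service in services
    )
```

In this sketch a service is offered only when the starting Source is linked to the Target and the object's portfolio lists that Target, which is one plausible reading of how the local collection limits the menu a user sees.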
In order for a resource to be an SFX
Source, it must be able to provide users with an SFX button on each
bibliographic citation record. This
button activates an OpenURL, a standard format proposed by SFX’s developers
(available: http://216.229.137.107/OpenURL/opeurl.html). The OpenURL encodes the set of SFX metadata
in a way that the SFX server can understand.
It may also contain instructions for fetching additional information
about the citation. The SFX server
parses the OpenURL and finds the set of conceptually related target services.
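
As a rough illustration of the idea, the following Python sketch encodes citation metadata as a URL query string and decodes it again. It is not the OpenURL specification itself: the resolver address and the field names (issn, volume, issue, spage, date) are assumptions chosen for the example.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical address of a local resolver; not a real server.
BASE_URL = "http://sfx.example.edu/resolver"


def build_openurl(citation, base_url=BASE_URL):
    """Encode non-empty citation fields as a query string on the resolver URL."""
    params = {key: value for key, value in citation.items() if value}
    return base_url + "?" + urlencode(params)


def parse_openurl(url):
    """Recover the citation fields from the query string (first value of each key)."""
    query = parse_qs(urlsplit(url).query)
    return {key: values[0] for key, values in query.items()}
```

A Source would attach a URL like the one build_openurl produces to its button, and the server would begin its work with something like parse_openurl before looking up related target services.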
URLs
for each target are built on the fly when a service is selected. Once the user clicks on the service, a simple
Perl program takes the metadata from the original citation and builds a URL
that includes the resource’s domain and its standard link-to syntax.
Most
information providers already have standard link-to syntax for their
resources. Many include standard
numbers, volume numbers, issue numbers, starting pages, or author names in the
URL. For example, Wiley Interscience’s
syntax includes the domain name and a unique journal identifier, while the
Royal Society of Chemistry’s syntax includes an abbreviated journal name
followed by the journal year and issue number. Syntaxes can either be found as
published standards (see Catchword’s syntax at
http://www.catchword.co.uk/cgi-bin/fs?pg=/liok.htm) or by simply analyzing a
given resource’s URLs.
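
Building a target URL from a link-to syntax can be sketched as filling a template with citation metadata. The example below uses Python rather than the Perl mentioned above, and the two templates are hypothetical: they do not reproduce any real publisher's syntax, and the names LINK_TEMPLATES and build_target_url are invented for illustration.

```python
from urllib.parse import quote

# Hypothetical link-to syntaxes; real ones differ from provider to provider.
LINK_TEMPLATES = {
    "publisher_a": "http://journals.example.com/{issn}/{volume}/{issue}/{spage}",
    "publisher_b": "http://pubs.example.org/link?jabbr={jabbr}&year={date}&issue={issue}",
}


def build_target_url(target, citation):
    """Fill the target's template; return None if a required field is missing."""
    template = LINK_TEMPLATES[target]
    # Percent-encode every value so it is safe inside a URL.
    safe = {key: quote(str(value), safe="") for key, value in citation.items()}
    try:
        return template.format(**safe)
    except KeyError:
        # The citation lacks a field this target's syntax requires.
        return None
```

Returning None when a field is missing lets a caller leave that service off the menu instead of offering a link that cannot work.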
Caltech
is currently working with a locally loaded version of ISI’s Web of Science as
an SFX source, and has been successful in creating links from citations in
that resource to the full text for over 375 journals, the local ILS (Innopac),
and PubMed. Future plans include adding
major electronic resources such as OVID, Innopac, SilverPlatter, and the major
CrossRef publishers as SFX Sources. In
addition to full-text journals and other A&I databases, targets could
include such local services as document delivery. Caltech uses the ILLiad
document delivery software and hopes to enable that resource as a target in
order to effortlessly pass bibliographic data in user-initiated ILL requests.
For
more information see SFX’s homepage: http://www.sfxit.com/ and two articles by
Herbert Van de Sompel in D-LIB magazine:
http://www.dlib.org/dlib/april99/van_de_sompel/04van_de_sompel-pt1.html
http://www.dlib.org/dlib/april99/van_de_sompel/04van_de_sompel-pt2.html
John McDonald, Acquisitions Librarian
Betsy Coles, Manager of Digital Library System
Caltech Library System, California Institute of Technology
[Editor’s note:
Readers should note that this technology has the potential to enable the use
of already existing metadata (e.g., ISSN, SICI, author/title names) to
facilitate known item searching across heterogeneous electronic resources
(article databases, A&I services, online catalogs, etc.). While currently only in beta test at several
North American institutions like Caltech, it has been up and running at the University
of Ghent since spring 2000.]
EVENTS
From the ALA/ALCTS
Preconference: Metadata: Libraries and the Web—Retooling AACR and MARC21 for
Cataloging in the Twenty-first Century, Chicago, IL, July 6-7, 2000
This preconference included 30
speakers and was organized to present information about major metadata
standards on day one and applications of metadata on day two. Jennifer Younger gave the keynote address,
providing a history of cataloging from the Greeks at Alexandria forward. She encouraged us to influence and
participate in the development of metadata schemes and left us with the phrase,
“Let 100 metadata schemes bloom.”
Session one on Methods of Providing
Access to Web Resources began with Brian Schottlaender speaking of AACR
complexities and specifically the Delsey Report, The Cardinal Principle, and
(ER) Harmonization. Rebecca Guenther
spoke on MARC21, the Dublin Core, and crosswalks between the two. The next three speakers focused on
seriality. Jean Hirons summarized the
current state of AACR2 and seriality.
Regina Romano Reynolds, Head of the National Serials Data Program,
addressed ISSN as a link to data – in the online catalog, as a link out to
other metadata, and as a link between publishers and libraries. In Struggling Toward Retrieval, Sheila
Intner asked if we were cataloging the chunks that patrons want and encouraged
the audience to expand bibliographic options for an increasingly diverse group
of users. Matthew Beacom brought
Session one to a close with his presentation on the use of AACR2R to catalog
web resources, concluding that AACR must adapt further to a pluralistic
metadata and data environment.
Session two focused on Methods of
Providing Access to Web Resources. Erik
Jul opened the session with the statement that library science has the
potential to become a guiding, leading science rather than a trailing
effort. Metadata specialists can become
consultants to the world! Norm Medeiros
followed with a summary of New York University School of Medicine’s
participation in CORC. One result was
that the school’s search engine was reconfigured to look at CORC tags,
enhancing retrieval. Lynn Marko was
next, bringing the group up to date on TEI (Text Encoding Initiative) with
Working Toward a Standard TEI Header for Libraries. Eric Miller and Diane Hillmann combined to do a presentation on
XML (eXtensible Markup Language) and RDF (Resource Description Framework)
contending that if we perceive XML and RDF as just another way to present
traditional materials, we’ve missed the point.
Hillmann encouraged the audience to extend their reach – to consider
providing access to collections and individual items either “above” or “below”
the level of granularity of books and serials.
She maintains that we can contribute an understanding of bibliographic
structure and indexing, insights into user behavior, and experience managing
big data – “let’s face the music and dance.”
After the afternoon break, Carlen
Ruschoff brought the audience up to date on various ISO standards for metadata
as well as outlining the ISO standard proposal path from initiative to
published standard. He spoke
specifically about a few ISO standards and drafts and included handout
information for many others relating to data elements (6), identifiers (8),
codes (3), character sets (14), transliteration of nonroman scripts (6), and
formats and protocols for communication and retrieval (4). Carlos Rodriguez and Laura Bayard finished
off the afternoon session with presentations on Infomine and MARCit,
respectively.
Session three, which began day two,
was built around the theme, Growing your own digital library at home. Four speakers described four different
projects, each in a library setting.
Elizabeth U. Mangan from the Geography and Map Division of the Library
of Congress described a scanning project of map images, the American Memory
access information assigned to the records, and the resulting functionality for
linkages and retrieval. Constance Mayer
of Indiana University provided us with a history and future goals of the
VARIATIONS Project. The project currently
provides access to over 6,000 titles of near CD-quality music from both their
OPAC (linked from the 856 field) and from the course reserve lists. URLs are established for each track. Descriptive metadata (provided by the
bibliographic record), structural metadata (such as track information), and
administrative metadata (date digitized, initials of the technician, etc.) are
created or acquired for each piece.
Beth Picknally Camden brought us back to the Dublin Core in her
presentation on the cataloging of digital video clips. The video clip collection was created for a
class, selected by both the professor and graduate students, and digitized by
the Digital Media Center. Because the
Dublin Core itself has no content standard, she walked us through their local
decisions for each Dublin Core element used.
She emphasized increased cooperation between the Digital Library and
Cataloging staff as a project benefit and encouraged each of us to get involved
with these kinds of projects at our own institutions. Finally, William Fietzer from the University of Minnesota
described interpretive encoding being added to electronic texts via TEI,
providing examples from their Women’s Travel Writing (1830-1930) collection.
Session four focused on metadata
projects apart from library settings – seven speakers in 2 hours! Diane Boehr gave us some insight on plans
for metadata at the National Library of Medicine. Stanley Blum, a zoologist, spoke on the use of metadata to
integrate biological collections and the similarities between libraries and the
natural history communities. Murtha
Baca described the work of the Art Information Task Force, which resulted in Categories for the Description of Works of
Art (CDWA). CDWA is a guideline for
the structure of art databases which could be likened to a hybrid of MARC and
AACR2 in that it contains both guidelines for data content and guidelines for
data value. She also provided
information about other metadata standards and vocabulary resources. Wendy Treadwell presented information on the
Data Documentation Initiative (DDI) and its role in Social Science Data access.
Kris Kiesling explained Encoded Archival Description (EAD). William Garrison followed with information
on the Colorado Digitization Project, including their use of OCLC’s SiteSearch
and the ability for participants to contribute records in different formats
(currently Dublin Core or MARC). Brad
Eden closed this session with a description of the Instructional Management System
Standard and its aim of enabling an open, rather than proprietary,
architecture.
The final session entitled Looking
toward the future was a panel discussion which included Clifford Lynch (CNI),
Vivian Bliss (Microsoft), and Michael Gorman (CSU-Fresno). Lynch reminded us of the whole point of
metadata – to make things more accessible – and maintained that metadata only
gets really interesting when we use it.
He challenged us to stop equating metadata just with description and instead to
see it from an information discovery standpoint – to always complement our
thinking of metadata with how it is going to be used and how it is going to be
transported.
Bliss
described her work with a team at Microsoft’s library to create a general
portal to their corporate intranet (over 2 million pages) where queries cross
several different collections with different tagging schemes and combine
results. Their group created and
maintains a metadata registry which brings organization to the company’s
diverse controlled vocabularies as well as those that come into Microsoft from
outside subscriptions to products like news feeds. Their group also markets their knowledge management expertise to
other groups within the company. She
echoed Eric Jul’s earlier comment, “If not us, then who?”
Gorman
likened the task of cataloging the web to catching lightning in a bottle,
asking what we are seeking to organize.
He presented our range of choices as 1) identify and catalog, 2)
identify and produce metadata according to some standard, 3) identify and
produce metadata without standards, or 4) leave some items in the murky waters
of the net. He asked us to remember
that cataloging an item without providing for the preservation of that item is
not sufficient.
Eric
Jul, Matthew Beacom, and Brad Eden returned briefly to the podium to wrap up
the preconference, stating that our greatest day lies ahead! Let’s take action.
[The
printed proceedings of this preconference will be published and are expected to
be available at ALA Midwinter meeting, January 2001.]
From the Joint
MARBI/CC:DA Meeting held during the American Library Association Annual
Meeting, Chicago, IL, July 10, 2000
XML
and MARC: A Choice or a Replacement? was presented by Dick R. Miller, Head of
Technical Services & Systems Librarian, Lane Medical Library, Stanford
University Medical Center. After
Miller’s presentation Paul Weiss and Matthew Beacom gave formal responses to
the presentation representing MARBI and CC:DA respectively. General discussion followed.
Miller
reported that in September 1998, Lane Medical Library undertook the Medlane
Project. It involved converting catalog
records to XML for integration with other web resources – in part a reaction to
the feeling that their library information (in MARC format) was under-utilized
because of its segregation from mainstream web resources and an awareness of
the reluctance of users to search multiple resources. Lane developed sample DTDs (document type definitions) to explore
restructuring and simplifying MARC and released XMLMARC software on December 29,
1999 to demonstrate conversion feasibility.
It is freely available for noncommercial use and there are currently 300
licensees from over 40 countries. It
was developed as a feasibility study, and they believe that feasibility has been proven. The project is currently focusing on issues
related to indexing, search access, and presentation.
Related
projects include BiblioML, which converts Unimarc to XML and was released in January by a French government
agency; the Library of Congress’s literal mapping of
MARC to SGML from 1995 to 1998; and Logos Research Systems' MARC to XML to MARC
Converter. However, in each of these
projects the mappings are literal.
Lane’s investigation differs in that it advocates changes to MARC to
take advantage of XML's strengths and would mean a permanent change to XML
rather than another version used as an adjunct to "real" MARC. Miller identified MARC as the chief
impediment to an effective integration of the library resources with web
resources.
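
To make the distinction concrete, a literal mapping simply re-expresses each MARC tag, indicator, and subfield as XML without restructuring anything. The Python sketch below shows that kind of mapping under stated assumptions: the element and attribute names (record, field, subfield, tag, ind1, ind2, code) are invented for illustration and are not those of XMLMARC, BiblioML, or any other project named here, and the sample title is made up.

```python
import xml.etree.ElementTree as ET


def marc_to_xml_literal(fields):
    """Map (tag, indicators, subfields) tuples one-to-one onto XML elements."""
    record = ET.Element("record")
    for tag, indicators, subfields in fields:
        field_el = ET.SubElement(
            record,
            "field",
            {"tag": tag, "ind1": indicators[0], "ind2": indicators[1]},
        )
        for code, value in subfields:
            subfield_el = ET.SubElement(field_el, "subfield", {"code": code})
            subfield_el.text = value
    return ET.tostring(record, encoding="unicode")


# Example: a title field with hypothetical content.
sample = [("245", "10", [("a", "An example title /"), ("c", "by an example author.")])]
xml_text = marc_to_xml_literal(sample)
```

The output carries exactly the structure of the input, including MARC's numeric tags and subfield codes, which is why a mapping of this kind does not by itself address the structural problems Miller describes.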
XMLMARC
was developed partly as a feasibility study for converting MARC data to XML,
but also to explore ways in which cataloging data could be restructured for
greater economy and elegance while still preserving content and previous
efforts. Some of the MARC problems
mentioned which could be better addressed by XML included its blurring of
description, access, and relationships; mixing data values and data properties;
excessive vs. insufficient subfields; redundancy; and character set issues.
Miller
believes that “it is possible to recast MARC, leveraging untold person-years of
effort in defining content, identifying relationships, and resolving problems
and conflicts, producing a more coherent and eloquent version using XML. This
could add luster to librarianship, engendering respect for librarians and
needed technical underpinnings at a time when the profession is facing external
as well as internal challenges.” He
suggests we do more analysis, build a model, consider transitional strategies,
and find a faster way to develop standards.
In
his response, Weiss listed a few cautions.
Storage space has been estimated at twice that of MARC. XML is a meta-language and not a single
standard; its flexibility makes standards critical to maximize its benefit –
much work would have to be done. He
encouraged consideration of XML for areas in the library not currently
standardized, such as the circulation protocol written in XML that will be out
for comment this summer, and concluded that there is a place for both.
Beacom
reminded the group that the discussion had been about how we express the
structure not about the content or fill.
We currently have an older expression of our structure (MARC), but
having a retooled structure (XML) might more adequately service the new kinds
of things we’re describing. He concurred
with Miller’s description of the weaknesses in MARC and mentioned that records
need to be able to become a cluster of related things. Beacom stated that it has been the flatness
of the file that has frustrated the resolution of the multiple version problem.
The
general discussion was favorable to exploring a move toward XML. Diane Hillmann reminded the group that our
big investment is in semantics (AACR), not in syntax (MARC), and stated that
MARC is eminently replaceable. We want
to look forward to more sophisticated linking and better support of
hierarchies, but will need to go carefully and not sacrifice what we have. John Attig encouraged the group to define
the task – adjust the structure? – adjust the semantics? – and cautioned
against ignoring legacy data. The
meeting concluded with consensus that the discussion needs to continue.
[Dick
R. Miller received an invitation to write an article on XML for Library Journal's NetConnect, which appeared in conjunction with this ALA Annual
Meeting. This article advocates not
only XML replacement of MARC formats, but also XML replacement of proprietary
"library information" formats used by ILS vendors (e.g. ILL, patron
data, circulation transactions, orders, check-in data) and predicts an XML-based
ILS in the near future. See
http://www.ljdigital.com/xml.asp. A related article on bibliographic management
at Lane Medical Library appears in Cataloging
& Classification Quarterly 30(2).]
From the CONSER At
Large Meeting held during the American Library Association Annual Meeting,
Chicago, IL, July 9, 2000
The
Joint Steering Committee will discuss “Revising AACR2 to Accommodate Seriality:
Rule Revision Proposals” prepared by Jean Hirons and the CONSER AACR Review
Task Force as well as comments received about this document at its meeting in
September 2000.
The
CONSER Task Force on Publication Patterns and Holdings reported that the
publication pattern experiment has begun use of the local OCLC bibliographic
field 891 to share publication pattern and holdings data. The experiment embeds MARC fields 853 and
863 in OCLC’s 891 allowing this data to be communicated among systems and used
for predictive check-in. In June, OCLC
record #35601086 for Heart Failure
Reviews became the first CONSER record in which 891 fields with such data
were loaded. This task force is also
analyzing responses from a survey of system vendors on their use of MARC Format
for Holdings Data; a report will be forthcoming.
ANNOUNCEMENTS
Guidelines on Subject Access to
Individual Works of Fiction, Drama, Etc., 2nd
edition is now available. Prepared by
the CCS Subject Analysis Committee subcommittee on the Revision of the
Guidelines on Subject Access to Individual Works of Fiction (Hiroko Aikawa,
Jan DeSirey, Linda Gabel, Susan Hayes, Kathy Nystrom, Mary Dabney Wilson, Pat
Thomas, Chair), this new and revised edition will help catalogers and others in
the library apply the suggested headings to individual works of fiction, enrich
catalog entries quickly and consistently by following guidelines, and satisfy
library patrons and readers by pointing them to targeted works, characters,
settings, and topics. Softcover, ISBN
0-8389-3503-6.
Music
and Media at the Millennial Crossroads: Special Materials in Today's Libraries. This joint OLAC (Online Audiovisual
Catalogers, Inc.) and MOUG (Music OCLC Users Group) Conference will be held
October 12-15, 2000 in Seattle, Washington at WestCoast Grand Hotel (formerly
Cavanaughs on Fifth Avenue). Martha Yee
(Cataloger, UCLA Film and Television Archive) and Sherry Vellucci (Associate
Professor, St. John’s University) will serve as keynote speakers. Topics for the cataloging workshops will be
computer files (taught by Iris Wolley), Internet resources (Linda Barnhart),
maps (Susan Moore and Kathryn Womble), music scores (Ralph Papakhian), realia
(Nancy Olson), sound recordings (Mark Scharff), video recordings (Jay Weitz),
and SACO (Adam Schiff). More
information can be found at http://ublib.buffalo.edu/libraries/units/cts/olac/.
Bicentennial Conference on Bibliographic Control for the New Millennium: Confronting the Challenges of Networked Resources and the Web. The Library of Congress is hosting this invitational conference on November 15-17, 2000. It is intended to bring together authorities in the cataloging and metadata communities to discuss outstanding issues involving improved discovery of and access to Web resources within the framework of international standards. The conference will focus on producing recommendations that will help the Library of Congress, the framers of AACR, and the library profession develop and implement an effective response to the bibliographic challenges posed by the proliferation of Web resources. Michael Gorman will give the keynote address, From Card Catalogues to WebPACs: Celebrating Cataloguing in the 20th Century. Discussion papers include Metadata for Web Resources: How Metadata Works on the Web by Martin Dillon and Metadata Schemes for Discovery in Digital Libraries: Trends, Interactions, and Common Themes by Caroline Arms. Additional information, names and topics of other speakers and commentators, and full text for some papers are available at http://lcweb.loc.gov/catdir/bibcontrol/. A discussion list to foster constructive feedback on issues addressed by the conference papers is currently available. To subscribe, send a message to listserv@loc.gov with the message "subscribe bibcontrol [your name]".