
The Document Architecture for the Cornell Digital Library (RFC1691)

IP.com Disclosure Number: IPCOM000002529D
Original Publication Date: 1994-Aug-01
Included in the Prior Art Database: 2000-Sep-12
Document File: 8 page(s) / 19K

Publishing Venue

Internet Society Requests For Comment (RFCs)

Related People

W. Turner: AUTHOR

Abstract

This memo defines an architecture for the storage and retrieval of the digital representations for books, journals, photographic images, etc., which are collected in a large organized digital library.

This text was extracted from an ASCII text document.
This is the abbreviated version, containing approximately 16% of the total text.

Network Working Group                                          W. Turner
Request for Comments: 1691                                           LTD
Category: Informational                                      August 1994

The Document Architecture for the Cornell Digital Library

Status of this Memo

This memo provides information for the Internet community. This memo
does not specify an Internet standard of any kind. Distribution of
this memo is unlimited.

Abstract

This memo defines an architecture for the storage and retrieval of
the digital representations for books, journals, photographic images,
etc., which are collected in a large organized digital library.

Two unique features of this architecture are the ability to generate
reference documents and the ability to create multiple views of a
document.

Introduction

In 1989, Cornell University and Xerox Corporation, with support from
the Commission on Preservation and Access and later Sun Microsystems,
embarked on a collaborative project to study and to prototype the
application of digital technologies for the preservation of library
material. During this project, Xerox developed the College Library
Access and Storage System (CLASS), and Cornell developed software to
provide network access to the CLASS Digital Library.

Xerox and Cornell University Library staff worked closely together to
define requirements for storing both low- and high-resolution
versions of images, so that the low-resolution images could be used
for browsing over the network and the high-resolution images could be
used for printing. In addition, substantial work was done to define
documents with internal structures that could be navigated. Xerox
developed the software to create and store documents, while Cornell
developed complementary software to allow library users to browse the
documents and request printed copies over the network.

Cornell has defined a document architecture which builds on the
lessons learned in the CLASS project, and is maintaining digital
library materials in that form.

Document Architecture Overview

Just as a conventional library contains books rather than pages, so
the electronic library must contain documents rather than images.
During the scanning process, images are automatically linked into
documents by creating document structure files which order the image
files in the same way the binding of a book orders the pages. Thus,
the digital book as currently configured consists of two parts: a set
of individual pages stored as discrete bit map image files, and the
document structure files which "bind" the image files into a
document. In addition, a database entry is made for each digital
document which permits searching by author and title (i.e.,
bibliographic information). Beyond the order of the pages, the
arrangement of a physical book provides information to readers. The
title page and publication information co...
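
The two-part arrangement described in this section (page image files
plus a structure that "binds" them in order, with a separate database
entry for author and title searching) can be illustrated with a small
sketch. This is not the file format or software defined by the memo.
The class names DigitalDocument and Catalog, the field names, the
sample author and title, and the ".tif" file names are all assumptions
made for illustration only.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DigitalDocument:
    """Hypothetical stand-in for a document structure file."""
    author: str
    title: str
    # Page image file names, kept in binding (scan) order.
    pages: list[str] = field(default_factory=list)

    def bind(self, image_file: str) -> None:
        """Append a page image after the pages already bound."""
        self.pages.append(image_file)

    def page(self, number: int) -> str:
        """Return the image file for a 1-based page number."""
        if not 1 <= number <= len(self.pages):
            raise IndexError(f"no page {number}")
        return self.pages[number - 1]


class Catalog:
    """Hypothetical bibliographic database: one entry per document."""

    def __init__(self) -> None:
        self._entries: list[DigitalDocument] = []

    def add(self, doc: DigitalDocument) -> None:
        self._entries.append(doc)

    def search(self, author: str = "", title: str = "") -> list[DigitalDocument]:
        """Case-insensitive substring match on author and title."""
        a, t = author.lower(), title.lower()
        return [d for d in self._entries
                if a in d.author.lower() and t in d.title.lower()]


# Made-up sample data: three scanned pages bound into one document.
book = DigitalDocument(author="Doe, Jane", title="A Sample Book")
for name in ("0001.tif", "0002.tif", "0003.tif"):
    book.bind(name)

catalog = Catalog()
catalog.add(book)
```

In this sketch the image files themselves are never modified. Page
order lives only in the binding structure, which loosely mirrors the
separation between image files and document structure files described
above.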