Preservation remains an ill-defined concept when applied to the development of digital library projects and collections. This preservation leaflet suggests a framework for understanding preservation in the digital context by creating a bridge from the five core principles of traditional preservation practice: longevity, choice, quality, integrity, and access. The essay describes how the purposes that preservation serves have broadened. It describes the transformation of preservation principles and outlines a perspective that reaffirms the relevance of preservation in the digital world. The leaflet is derived from two sources: the author's report Preservation in the Digital World and his presentations at a sequence of NEDCC's School for Scanning from 1996 to 1999.1
Preservation is not just for the world of paper. We know that digital imaging technology, in and of itself, provides no easy answers to the preservation question. Indeed, simply defining what preservation means in the digital imaging environment is a challenge; responding to the insight that such a definition might provide is harder still.2 The digital world poses significant challenges to, but does not eliminate the need for, responsible, effective preservation activity.
When a library, archives, historical society, museum, or any other cultural institution with a preservation mandate stops experimenting with digital technology and decides to use it to improve services or transform operations, then that institution has embarked down the preservation path. Digital imaging technologies entail a tremendous investment of resources in an environment of flat budgets. The risk of loss is high — far higher than in most other preservation functions. The nearly constant swirl of product development that fuels our perceptions of change raises the stakes yet higher. Understanding where the risk lies and making an institutional commitment to lessen it is precisely what preservation in a digital world is all about.
Transforming the Purpose of Preservation
The term "preservation" is an umbrella under which most librarians and archivists cluster all of the policies and options for action, including conservation treatments. It has long been the responsibility of librarians and archivists — and the clerks and scribes who went before them — to assemble and organize documentation of human activity in places where it can be protected and used. The ethic of preservation as coordinated and conscious action to increase the likelihood that evidence about how we live, how we think, and what we have accomplished will survive, however, is a recent phenomenon. Traditional preservation as "responsible custody" is successful when the value of the evidence exceeds the cost of keeping it, when this evidence has a physical form, and when the roles of evidence creators, evidence keepers, and evidence users are mutually reinforcing.
The essence of preservation management is resource allocation. People, money, and materials must be acquired, organized, and put to work to prevent deterioration or renew the usability of selected groups of materials. Preservation is concerned largely with the evidence embedded in a nearly endless variety of forms and formats. Things are preserved so that they can be used for all kinds of purposes, scholarly and otherwise. People with the responsibility to do so have determined that some small portion of the vast sea of information, structured as collections of documents, books, collections, and other "things," has research value as evidence well beyond the time and the intentions of those who created or published it.3 This distinction between the value of the content (usually text and illustration) and the value of the evidence embedded in the object is at the heart of a decision-making process that is itself central to effective management of traditional and digital library materials.
It is possible to distinguish among three distinct but not mutually exclusive preservation applications of digital technologies, defined in part by the possible purposes that the product may serve for end users.4
- Protect Originals. The most common application of digital technologies in an archive or library is to create digital copies of sufficient quality that they can be used for ready reference in lieu of casual browsing through the original sources. Preservation goals are met because the original documents can be protected by limiting access to them. Examples include image reference files of photograph, clipping, or vertical files which permit the identification of individual items requiring closer study. The original order of the collection, or a book, is "frozen" much like microfilm sets images in a linear array. This preservation use of the technology has become a compelling force motivating archives and libraries to experiment with hardware and software capabilities.
- Represent Originals. A digital system could be built that represents the information content of the original sources in such detail that the system can be used to fulfill most, if not all, of the research and learning potential of the original documents. High-resolution systems that strive for comprehensive and complete content and seek to obtain "full information capture," based on emerging standards and best practices, fit this definition. Systems of this intermediate level of quality open new avenues of research and use and can transform the service missions of those who create the products.
- Transcend Originals. In a very small number of applications, digital imaging promises to generate a product that can be used for purposes that are impossible to achieve with the original sources. This category includes imaging that uses special lighting to draw out details obscured by aging, use, and environmental damage; imaging that makes use of specialized photographic intermediates; or imaging of such high resolution that the study of artifactual characteristics is possible.
Each of these applications places separate, but increasingly rigorous, demands on digital technologies. In each case, the use of an intermediate film or paper copy to facilitate the scanning process may or may not be necessary or advisable. Finally, the disposition of original sources (including undertaking preservation treatments before or after conversion) is a matter quite separate from the decision to undertake digital conversion. Ultimately, the purpose of the digital image product is driven by the uses to which it will be put, while preservation of original source documents should be determined by the preservation needs of the original sources.
Leadership in Transforming Preservation
Preservation in the digital world is one of the central leadership issues of our day. Some librarians and archivists seem to think that leadership on technological issues is a matter of establishing control through the application of standards and procedural guidelines. Others have argued that the rapid pace of technological change and the sheer complexity of the technology render librarians and archivists helpless in influencing the technological developments. Both perspectives are misleading. Those who hope to exercise control over the use of digital imaging technology in libraries and archives assume that moral persuasion can prevail in the absence of a significant market share. Those who prefer to "wait and see" how digital imaging technology shakes out, before making the administrative commitments necessary to ensure long-term preservation, shirk their responsibility to define the terms of the debate.
Preservation in the digital world must be the shared goal that leaders and followers elicit together. It is the responsibility of many people in many institutions fulfilling many roles. An understanding of the impact of this role differentiation on digital preservation action is crucial to identifying which of the many facets of digital technology we can control, which trends we may only influence, and for which aspects we must relinquish any vain expectation of either control or influence.
In the past two decades, a consensus has emerged within a community of practitioners about a set of fundamental preservation principles that should govern the management of available resources in a mature preservation program. The fundamental principles of preservation in the digital world are the same as those of the analog world and, in essence, define the priorities for extending the useful life of information resources. These fundamental concepts are longevity, choice, quality, integrity, and access.
The Transformation of Longevity
The central concern in traditional preservation practice is the media upon which information is stored. The top priority is extending the life of paper, film, and magnetic tape by stabilizing their structures and limiting the ability of internal and external factors to cause deterioration. The focus on external factors led to specifications for proper environmental controls, care and handling guidelines, and disaster recovery procedures. Progress on efforts to control or mitigate the internal factors of deterioration has resulted in alkaline paper standards, archival quality microfilm, mass deacidification, and more rugged magnetic media. And yet, now that archivists and librarians have defined the issues surrounding the life expectancy of media, the very concept of longevity itself is fading as a meaningful intellectual construct for preservation.
Digital preservation has little concern for the longevity of optical disks and newer, more fragile storage media. The viability of digital image files is much more dependent on the life expectancy of the access system — a chain only as strong as its weakest component. Today's optical media most likely will far outlast the capability of systems to retrieve and interpret the data stored on them. Since we can never know for certain when a system cannot be maintained or supported by a vendor, libraries must be prepared to migrate valuable image data, indexes, and software to future generations of the technology.
Librarians can exercise control over the longevity of digital image data through the careful selection, handling, and storage of rugged, well-tested storage media. They can influence the life expectancy of the information by making sure that local budgetary commitments are made consistently at an appropriate level. Ultimately, we have no control over the evolution of the imaging marketplace, especially corporate research and development activities that have a tremendous impact on the life expectancy of the digital files we are creating today.
The Transformation of Choice
Preservation adds value through selection. Selection is choice and choice involves defining value, recognizing it in something, and then deciding to address its preservation needs in the way most appropriate to that value. Over decades the act of preservation has evolved from saving material from oblivion and assembling it in secure buildings to more sophisticated condition and value assessments on the already-collected. Preservation selection in libraries has largely been dictated by the need to stretch limited resources in as wise a fashion as possible, resulting in the dictum that "no item shall be preserved twice." The net result is a growing "virtual" special collection of items preserved with a variety of techniques, most notably by reformatting on microfilm. Selection is perhaps the most difficult of undertakings precisely because it is static and conceived by practitioners as either completely divorced from present use or completely driven by demand.
Selection in the digital world is not a choice made "once and for all" near the end of an item's life cycle, but rather is an ongoing process intimately connected to the active use of the digital files. The value judgments applied when making a decision to convert documents from paper or film to digital images are valid only within the context of the original system. It is a rare collection of digital files, indeed, that can justify the cost of a comprehensive migration strategy. Without factoring in the larger intellectual context of related digital files stored elsewhere and their combined uses for teaching and learning, preservation decision making cannot take place.
Even while recognizing that selection decisions cannot be made in a vacuum, librarians and archivists CAN choose which books, articles, photographs, film, and other materials are converted from paper or film into digital image form. Influence over the continuing value of digital image files is largely vested in the right to decide, in close coordination with the many parties interested in the decision, when it is time to migrate image data to future storage and access systems and when a digital file has outlived its usefulness to the institution charged with preserving it. What we cannot control is the impact of these ongoing value judgments on the abilities of our patrons to find and use information in digital form.
The Transformation of Quality
Maximizing the quality of all work performed is such an important maxim in the preservation field that few people state this fundamental principle directly. Instead, the preservation literature dictates high quality outcomes by specifying standards for treatment options, reformatting processes, and preventive measures. The commitment to quality standards — do it once, do it right — permeates all preservation activity, including library binding standards, archival microfilm creation guidelines, conservation treatment procedures, the choice of supplies and materials, and a low tolerance for error. The evolution of preservation microfilming as a central strategy for the bulk of brittle library materials has placed the quality of the medium and the quality of the visual image on an equal plane. In the pursuit of quality microfilm, compromise on visual truth and archival stability is dictated only by the characteristics of the item chosen for preservation.
Quality in the digital world is conditioned significantly by the limitations of capture-and-display technology. Digital conversion places less emphasis on obtaining a faithful reproduction in favor of finding the best representation of the original in digital form. Mechanisms and techniques for judging quality of digital reproductions are different and more sophisticated than those for assessing microfilm or photocopy reproductions.5 Additionally, the primary goal of preservation quality is to capture as much intellectual and visual content as is technically possible and then present that content to readers in ways most appropriate to their needs.
The image market has transformed the principle of maintaining the highest possible quality over time to one of finding the minimal level of quality acceptable to today's system users. We must reclaim image quality as the heart and soul of digital preservation. This means maximizing the amount of data captured in the digital scanning process, documenting image enhancement techniques, and specifying file compression routines that do not result in the loss of data during telecommunication. We can control standards of digital quality, just as we have done for microfilm. We can only influence the development of standards for data compression, communication, display, and output. Out of our hands are improvements in the technical capabilities of image conversion hardware and software. We risk hastening obsolescence by prematurely setting overly rigorous equipment specifications.
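The requirement that compression "not result in the loss of data" can be checked mechanically: compress the image data, decompress it, and confirm that every byte matches the original. The following is a minimal sketch of that round-trip test, not a recommended production routine. It assumes Python's standard zlib module as a stand-in for whatever lossless routine an institution actually specifies, and the function names are hypothetical.

```python
import zlib


def compress_lossless(data: bytes) -> bytes:
    """Compress bytes with zlib (a lossless codec used here only as an example)."""
    return zlib.compress(data, 9)


def round_trips(data: bytes) -> bool:
    """Return True if decompressing the compressed data restores every byte."""
    return zlib.decompress(compress_lossless(data)) == data


if __name__ == "__main__":
    # Hypothetical stand-in for scanned image data.
    sample = bytes(range(256)) * 64
    print("lossless:", round_trips(sample))
    print("original size:", len(sample), "compressed size:", len(compress_lossless(sample)))
```

A lossy routine would fail this comparison by design, which is why the choice of compression belongs in the quality specification rather than being left to system defaults.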
The Transformation of Integrity
The concept of integrity has two dimensions in the traditional preservation context — physical and intellectual — both of which concern the nature of the evidence. Physical integrity largely concerns the item as artifact and plays out most directly in the conservation studio, where skilled bench staff use water-soluble glues, age-old hand-binding techniques, and high quality materials to protect historical evidence of use, past conservation treatments, and intended or unintended changes to the structure of the item. The preservation of intellectual integrity is also based upon concern for evidence of a different sort. The authenticity, or truthfulness, of the information content of an item, maintained through documentation of both provenance — the chain of ownership — and treatment, where appropriate, is at the heart of intellectual integrity. Beyond the history of an item is concern for protecting and documenting the relationships among items in a collection. In traditional preservation practice, the concepts of quality and integrity reinforce each other.
In the digital world, a commitment to maintaining the physical integrity of a digital image file has far less to do with the media upon which the data are stored than with the loss of information when a file is created originally and then compressed mathematically or sent across a network. In the domain of intellectual integrity, structural indexes and data descriptions traditionally published with an item as tables of contents or prepared as discrete finding aids or bibliographic records must be inextricably linked and preserved along with the digital image files themselves. Preserving intellectual integrity also involves authentication procedures, like audit trails, to make sure files are not altered intentionally or accidentally.6 Ultimately, the digital world transforms traditional preservation principles from guaranteeing the physical integrity of the object to specifying the creation of the object whose intellectual integrity is its primary characteristic.
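One way to picture an audit trail of the kind mentioned above is a log that records a cryptographic digest of a file each time it is created or deliberately modified, so that a later check can reveal an undocumented change. The sketch below is an illustration under stated assumptions, not a description of any particular system: it uses SHA-256 from Python's standard library, keeps the trail in an in-memory list, and all function and field names are hypothetical.

```python
import hashlib
from datetime import datetime, timezone


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the file contents as a hex string."""
    return hashlib.sha256(data).hexdigest()


def record_event(audit_trail: list, file_id: str, data: bytes, action: str) -> dict:
    """Append an entry documenting an action taken on a file and its digest."""
    entry = {
        "file_id": file_id,
        "action": action,
        "sha256": fingerprint(data),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    audit_trail.append(entry)
    return entry


def is_unaltered(audit_trail: list, file_id: str, data: bytes) -> bool:
    """Compare current contents with the most recently recorded digest."""
    entries = [e for e in audit_trail if e["file_id"] == file_id]
    if not entries:
        raise KeyError(f"no audit entries for {file_id}")
    return entries[-1]["sha256"] == fingerprint(data)


if __name__ == "__main__":
    trail = []
    page = b"scanned page image bytes"  # hypothetical file contents
    record_event(trail, "page-0001", page, "created")
    print(is_unaltered(trail, "page-0001", page))
    print(is_unaltered(trail, "page-0001", page + b"x"))
```

A digest detects that a file has changed but not who changed it or why; the documentation of successive modifications still has to be supplied by the people and procedures around the log.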
Librarians and archivists can control the integrity of digital image files by authenticating access procedures and documenting successive modifications to a given digital record. We can also create and maintain structural indexes and bibliographic linkages within well developed and well understood database standards. We also have a role to play in influencing the development of metadata interchange standards — including the tools and techniques that will allow structured, documented, and standardized information about data files and databases to be shared across platforms, systems, and international boundaries. It is vain to think, however, that librarians and archivists are anything but bystanders observing the rapid development of network protocols, bandwidth, or data security techniques.
The Transformation of Access
In spite of decades of claims to the contrary, increased access is largely a coincidental byproduct of traditional preservation practice, not its central focus. Indeed, the preservation and access responsibilities of an archive or library are more often in constant tension. "While preservation is a primary goal or responsibility, an equally compelling mandate — access and use — sets up a classic conflict that must be arbitrated by the custodians and caretakers of archival records," states the fundamental textbook in the field.7 The mechanism for ensuring access to a preserved item or collection is a bibliographic record located in local online catalogs or national bibliographic databases. In traditional preservation, access mechanisms, such as bibliographic records and archival finding aids, simply provide a notice of availability and are not an integral part of the object.
In the fifty years that preservation has been emerging as a professional speciality in libraries and archives, the intimate relationship between the concepts of preservation and access has undergone a sequence of transformations that mirror the changes in the technological environment in which cultural institutions have functioned. In the digital world, access is transformed from a convenient byproduct of the preservation process to its central motif.
Control over the access requirements of digital preservation, especially the capability to migrate digital image files to future generations of the technology, can be exercised in part through prudent purchases of only non-proprietary hardware and software components. In the present environment, true "plug-and-play" components are becoming more widely available, and our (limited) checkbooks are the only incentive we can offer vendors to adopt open system architectures or at least to provide better documentation on the inner workings of their systems. Additionally, librarians and archivists can influence vendors and manufacturers to provide new equipment that is "backwardly compatible" with existing systems. This capability assists image file system migration in the same way that today's word processing software allows access to documents created with earlier versions. Much as we might wish otherwise, the life expectancy of a given digital image system and the requirement to abandon that system are profoundly important matters over which we have little or no control. Perversely, it seems, the commitment of a vendor to support and maintain an old system is inversely related to that vendor's ability to market a new system.
The Transformation of Preservation AND Access
- Preservation OR Access: In the early years of modern archival agencies — prior to World War II — preservation simply meant collecting. The sheer act of pulling a collection of manuscripts from a barn, a basement, or a parking garage and placing it intact in a dry building with locks on the door fulfilled the fundamental preservation mandate of the institution. In this regard preservation and access are mutually exclusive activities. Use exposes a collection to risk of theft, damage, or misuse of either content or object. The safest way to ensure that a book lasts for a long time is to lock it up or make a copy for use.
- Preservation AND Access: Modern preservation management strategies posit that preservation and access are mutually reinforcing ideas. Preservation action is taken on an item so that it may be used. In this view, creating a preservation copy on microfilm of a deteriorated book without making it possible to find the film is a waste of money. In the world of preservation AND access, however, it is theoretically possible to fulfill a preservation need without solving access problems. Conversely, access to scholarly materials can be guaranteed for a very long period, indeed, without taking any concrete preservation action on them.
- Preservation IS Access: Librarians and archivists concerned about the preservation of electronic records sometimes view the two concepts as interchangeable nouns. The act of preserving makes access possible. Equating preservation with access, however, implies that preservation is defined by availability, when indeed this construct may be getting it backwards. Preservation is no more access than access is preservation. Simply re-focusing the preservation issue on access oversimplifies the preservation issues by suggesting that access is the engine of preservation without addressing the nature of the "thing" being preserved.
- Preservation OF Access: In the digital world, preservation is the action and access is the thing — the act of preserving access. A more accurate construct simply states "preserve access." When transformed in this way, a whole new series of complexities arises. Preserve access to WHAT? The answer suggested in this report is a high quality, high value, well-protected, and fully integrated version of an original document. The content, structure, and integrity of the information object assume center stage — and the ability of a machine to transport and display this information object becomes an assumed end result of the preservation action rather than its primary goal.
A New Mandate for Digital Technology
It is impossible to come to terms with the responsibilities inherent in digital preservation without distinguishing between "acquiring" digital imaging technology to solve a particular problem and "adopting" it as an information management option. Acquiring an imaging system to enhance access to library and archive materials is now almost as simple as choosing the combination of off-the-shelf scanners, computers, and monitors that meets immediate specifications. Hundreds of libraries and archives have already invested in or are planning to purchase digital image conversion systems and experiment with their capabilities. Innumerable pilot projects demonstrate how much more challenging it is to digitize scholarly resources than the modern office correspondence and case files that drove the technology a decade ago. In time, most of these small-scale, stand-alone applications will fade away quietly, and the initial investment will be lost, as the costs of maintaining these systems become apparent, as vendors go out of business, and as patrons become more accustomed to remote-access image databases and the latest bells and whistles.
The process of converting library materials to an electronic form — a process which in many aspects is similar to the one used to create preservation microfilm — is distinct from any particular medium upon which the images may be stored at a particular point in time. This distinction allows for a continuing commitment to creating and maintaining digitized information while entertaining the possibility that other, more advanced storage media may render optical media obsolete.
Administrators who have responsibility for selecting systems for converting materials with long-term value also bear responsibility for providing long-term access. This commitment is a continuing one; decisions about digital preservation cannot be deferred in the hope that technological solutions will emerge like a medieval knight in shining armor. An appraisal of the present value of books, manuscript collections, or a series of photographs in their original format is the necessary point of departure for judging the preservation of the digital image version. The mere potential of increased access to a digitized collection does not add value to an underused collection. Similarly, the powerful capabilities of a relational index cannot compensate for a collection of documents whose structure, relationships, and intellectual content are poorly understood. Random access is not a magic potion for effective collection management.
Conclusion: The Preservation Uses of Digital Technology
If libraries, archives, and museums expect to adopt digital imaging technology for purposes of transforming the way they serve their patrons and each other, then they must move beyond the experimental stage. Digital image conversion, in an operational environment, requires a deep and long-standing institutional commitment to preservation, the full integration of the technology into information management procedures and processes, and significant leadership in developing appropriate definitions and standards for digital preservation.
In the past three years, significant progress has been made to define the terms and outline a research agenda for preserving digital information that was either "born digital" or transformed to digital from traditional sources. "Digital preservation refers to the various methods of keeping digital materials alive into the future," according to a recent report from the Council on Library and Information Resources.8 Digital preservation typically centers on the choice of interim storage media, the life expectancy of a digital imaging system, and the expectation of migrating the digital files to future systems while maintaining full functionality and the integrity of the original digital system. PBS recently aired the film, "Into the Future," which graphically portrayed the problem of digital information and speculated on the consequences of inaction, all the while offering precious few ideas of what to do about the dilemma.
It may be premature for most of us to worry about preserving digital objects until we have figured out how to make digital products that are worth preserving. Digital imaging technologies create an entirely new form of information. Digital imaging technology is not simply another reformatting option in the preservation tool kit. Digital imaging involves transforming the very concept of format, not simply creating a faithful reproduction of a book, document, photograph, or map on a different medium. The power of digital enhancement, the possibilities for structured indexes, and the mathematics of compression and communication together alter the concept of preservation in the digital world. These transformations, along with the new possibilities they place on us as information professionals, force us in turn to transform our library and archival services and programs.
Besser, Howard and Jennifer Trant. Introduction to Imaging: Issues in Constructing an Image Database. Santa Monica: Getty Art History Information Program, 1995. http://www.gii.getty.edu/intro_imaging/
Coleman, James and Don Willis. SGML as a Framework for Digital Preservation and Access. Washington, D.C.: Commission on Preservation and Access, 1997.
Conway, Paul. "Selecting Microfilm for Digital Preservation." Library Resources & Technical Services 40 (January 1996): 67–77.
Conway, Paul. Preservation in the Digital World. Washington, D.C.: Commission on Preservation and Access, March 1996. http://www.clir.org/cpa/reports/conway2/
Digital Imaging Technology for Preservation. Proceedings from an RLG Symposium Held March 17 and 18, 1994. Edited by Nancy E. Elkington. Mountain View, CA: Research Libraries Group, 1994.
Dollar, Charles M. Archival Theory and Information Technologies: The Impact of Information Technologies on Archival Principles and Methods. Macerata: University of Macerata Press, 1992.
Ester, Michael. Digital Image Collections: Issues and Practice. Washington, D.C.: Commission on Preservation and Access, 1996.
Fox, Edward A., et al. "Digital Libraries: Introduction," Communications of the ACM 38 (April 1995): 23–24.
Frey, Franziska. "Digital Imaging for Photographic Collections: Foundations for Technical Standards." RLG DigiNews 1 (3), December 15, 1997. http://www.rlg.org/preserv/diginews/diginews3.html#com
Gertz, Janet. Oversize Color Images Project, 1994-1995: Final Report of Phase I. Washington, D.C.: Commission on Preservation and Access, 1995.
Graham, Peter S. "Requirements for the Digital Research Library." College & Research Libraries 56 (July 1995): 331–39.
Hazen, Dan, Jeffrey Horrell, and Jan Merrill-Oldham. Selecting Research Collections for Digitization. Washington, D.C.: Council on Library and Information Resources, 1998. http://www.clir.org/pubs/reports/hazen/pub74.html
Kenney, Anne R. and Stephen Chapman. Digital Resolution Requirements for Replacing Text-Based Material: Methods for Benchmarking Image Quality. Washington, D.C.: Commission on Preservation and Access, 1995.
Kenney, Anne R. and Stephen Chapman. Digital Imaging for Libraries and Archives. Ithaca, NY: Dept. of Preservation and Conservation, Cornell University Library, 1996.
Levy, David M. and Catherine C. Marshall. "Going Digital: A Look at Assumptions Underlying Digital Libraries." Communications of the ACM 38 (April 1995): 77–84.
Lynch, Clifford. "The Integrity of Digital Information: Mechanics and Definitional Issues." Journal of the American Society for Information Science 45 (December 1994): 737–44.
McClung, Patricia A. Digital Collections Inventory Report. Washington, D.C.: Commission on Preservation and Access, February 1996.
Mohlhenrich, Janice. Preservation of Electronic Formats: Electronic Formats for Preservation. Fort Atkinson, Wisc.: Highsmith, 1993.
Ostrow, Stephen E. Digitizing Historical Pictorial Collections for the Internet. Washington, D.C.: Council on Library and Information Resources, 1998. http://www.clir.org/pubs/reports/ostrow/pub71.html
Reilly, James M. and Franziska A. Frey. Recommendations for the Evaluation of Digital Images Produced from Photographic, Microphotographic, and Various Paper Formats. Report to the Library of Congress National Digital Library Project. Rochester, NY: Image Permanence Institute, May 1996. http://memory.loc.gov/ammem/ipirpt.html
Robinson, Peter. The Digitization of Primary Textual Sources. Office for Humanities Communication Publication, no. 4. Oxford: Oxford University Computing Services, 1993.
Rothenberg, Jeff. "Ensuring the Longevity of Digital Documents." Scientific American 272 (January 1995): 42–47.
Van Bogart, John W. Magnetic Tape Storage and Handling: A Guide for Libraries and Archives. Washington, D.C.: Commission on Preservation and Access, 1995.
Waters, Donald and John Garrett. Preserving Digital Information: Report of the Task Force on Archiving of Digital Information. Washington, D.C.: Research Libraries Group and Commission on Preservation and Access, May 1996. http://www.rlg.org/ArchTF/
Written by Paul Conway