Many collections managers, particularly those in small institutions, have been confronted with an administrator who asks: "Why don't we just digitize everything? Then we wouldn't have to worry about preservation!"
The short answer to this question is that while digitization is an excellent tool for researcher access, it cannot capture all of the information an artifact carries. Nonetheless, digitization is a valuable option in the preservation toolkit.
Digitization is the process of making a digital copy of a non-digital object, or the conversion of analog information to digital information. Books, audiotapes, microfilm, and even three-dimensional objects can be digitized. It is important to distinguish between digitization and digital preservation, a topic that will be reviewed later in this session: digitization is the act of creating a digital copy; digital preservation is the management of digital content over time. For a more in-depth look at these frequently confused terms, see the blog post Digitization and Digital Preservation: Questions Persist at The Signal, the digital preservation blog of the Library of Congress.
Nearly every cultural heritage institution today has a digitization initiative, albeit at varying scales and with varying goals. Given the focus on digitization within the cultural community, it makes sense to look in more detail at the major issues involved in digitizing collections to see what can be learned from existing digitization efforts.
Key challenges of digitization include:
- knowing where to look for current standards and guidelines;
- navigating the decision to set up in-house digitization operations vs. outsourcing digitization to a contract vendor (more on this topic later in this session in Managing Reformatting and Digitization);
- creating digital objects worthy of the effort and expense required to preserve them;
- describing those digital objects in order to facilitate access;
- finding or developing systems for the long-term management of those digital objects; and
- staying aware of changing technology considerations, including migration of formats as standards develop and hardware becomes obsolete.
Unlike the standards for preservation microfilming and photocopying, standards for digitization are still evolving and will continue to change as hardware, software, and user expectations change.
There are, however, a number of projects and publications that have set forth best practices for creating high-quality digital copies of analog formats. If you choose to undertake a digitization project, you will need to create your own technical guidelines that reflect your institution's needs and your project's goals.
Some current digitization resources include:
- The Preservation Guidelines for Digitizing Library Materials from the Library of Congress lists considerations and requirements for preservation digitization projects.
- The Federal Agencies Digitization Guidelines Initiative (FADGI) provides guidelines for a number of still and audio-visual formats.
- The Minimum Digitization Capture Recommendations from the Preservation and Reformatting Section (PARS) of ALA/ALCTS provides guidelines for cultural heritage institutions digitizing content with the objective of producing a sustainable product that will not need to be re-digitized.
- Cornell University's Moving Theory into Practice: Digital Imaging Tutorial. Look particularly at "Section 1: Basic Terminology," "Section 2: Conversion," and "Section 3: Quality Control" to get a sense of the issues to be considered when deciding on technical requirements for image capture (commonly referred to as "benchmarking").
See Additional Resources for links to other online resources that provide guidelines for creating digital collections.
Despite the possible variations in technical requirements, there are several standard steps in digital imaging. As with other reformatting methods, most institutions will find digitization too complex and expensive to undertake in-house; contracting with a vendor experienced in working with historical paper-based collections is recommended.
The basic steps in the digitization of paper-based collections are as follows:
- Capture—Document(s) or other materials are captured in digital form using a scanner or digital camera. Decisions made about the desired image quality during benchmarking (e.g., the type of scan, the resolution of the scan, the bit depth) are implemented.
- Image processing—This includes image editing if necessary (e.g., compression of files, sharpening of images) and the creation of metadata (sometimes defined as "data about data"). Metadata indexes and describes the scanned materials. A table explaining the types of metadata can be found within Moving Theory into Practice: Digital Imaging Tutorial.
- Delivery—This is the process of getting the digitized content to the user. Delivery methods, file formats, file compression, and acceptable image quality will differ depending on various user characteristics. It is crucial to consider image delivery needs during project planning, not after the images have already been created.
- Quality control—This involves both initial and ongoing evaluation of whether the technical requirements for image capture, processing, and delivery are being met. At the beginning of the project, it is a good idea to digitize a representative sample of documents to be sure that all quality requirements are being fulfilled.
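To make the capture and quality control steps more concrete, the following Python sketch estimates the uncompressed size of a scan from its dimensions, resolution, and bit depth, and then checks a small sample of scans against a capture specification. It is an illustration only: the names CaptureSpec, Scan, uncompressed_bytes, and spec_failures are hypothetical, and the 400 ppi and 24-bit values are placeholders, not recommendations drawn from any of the guidelines listed above. Substitute the requirements set during your own benchmarking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaptureSpec:
    """Hypothetical benchmarking decisions for one class of material."""
    min_ppi: int        # minimum resolution, in pixels per inch
    min_bit_depth: int  # minimum bits per pixel (8 = grayscale, 24 = RGB color)


@dataclass(frozen=True)
class Scan:
    """Technical facts recorded for one captured image (hypothetical record)."""
    name: str
    width_in: float   # width of the original, in inches
    height_in: float  # height of the original, in inches
    ppi: int          # resolution actually used at capture
    bit_depth: int    # bits per pixel actually used at capture


def uncompressed_bytes(scan: Scan) -> int:
    """Raw image size: pixels wide x pixels high x bits per pixel / 8."""
    width_px = round(scan.width_in * scan.ppi)
    height_px = round(scan.height_in * scan.ppi)
    return width_px * height_px * scan.bit_depth // 8


def spec_failures(scan: Scan, spec: CaptureSpec) -> list[str]:
    """Return a list of ways the scan falls short of the spec (empty if none)."""
    problems = []
    if scan.ppi < spec.min_ppi:
        problems.append(f"resolution {scan.ppi} ppi is below {spec.min_ppi} ppi")
    if scan.bit_depth < spec.min_bit_depth:
        problems.append(
            f"bit depth {scan.bit_depth} is below {spec.min_bit_depth}"
        )
    return problems


# Placeholder values for illustration only, not a recommended standard.
spec = CaptureSpec(min_ppi=400, min_bit_depth=24)

# A representative sample digitized at the start of the project.
sample = [
    Scan("letter_001", 8, 10, 400, 24),
    Scan("letter_002", 8, 10, 300, 8),
]

for scan in sample:
    size_mb = uncompressed_bytes(scan) / 1_000_000
    problems = spec_failures(scan, spec)
    status = "OK" if not problems else "FAIL: " + "; ".join(problems)
    print(f"{scan.name}: {size_mb:.1f} MB uncompressed, {status}")
```

Under these assumptions, an 8 x 10 inch original captured at 400 ppi in 24-bit color is 3,200 x 4,000 pixels at 3 bytes per pixel, or 38,400,000 bytes before any compression. Multiplying a figure like this by the number of items in a collection gives a rough sense of the storage a project will require, which is one reason image quality decisions belong in project planning.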
After digital objects are created and appropriately indexed and described, they must be stored on-line, near-line, and/or off-line. Over time, however, obsolescence of hardware and software (the technology chain used to access digital objects) becomes a major concern. If a hard disk drive survives intact for 10 or even 50 years, but no device survives that can retrieve the data, then the data has effectively been lost. A further examination of digital preservation and repository management is presented later in this session in Managing Reformatting and Digitization.