Preservation Metadata

Where does Preservation Metadata fit in the larger world of metadata?

In 1998 the Newham Museum Archaeological Service closed. Data from over a decade of fieldwork was hurriedly saved on 239 3.5-inch floppy disks and given to the Archaeology Data Service (ADS) for restoration. The files on the floppies were up to ten years old; many were corrupted, lacked documentation, or had been created using now-obsolete software. After much hard work and expense, the ADS was able to preserve most of the files and, in the process, identify steps toward better data preservation in the future.

The defining characteristic of Preservation Metadata is that it describes attributes and issues that remain useful over the long-term life of the objects. The conventional view sorts metadata into three categories: Administrative, Descriptive, and Structural. This division does not highlight the common purpose of preservation metadata.

The OAIS model introduces another view, distinguished by the long-term function of the metadata. Four new categories—Reference, Context, Provenance, and Fixity Information—grouped under the umbrella term Preservation Description Information (PDI), make the long-term issues explicit. A fifth category, called Representation Information, contains information about the viewers and programs needed to process particular digital objects. A sixth category, Descriptive Information, contains more ephemeral metadata—the information used to aid searching, ordering, and retrieval of the objects.

Preservation Description Information

1) Reference Information: enumerates and describes identifiers assigned to the content information such that it can be referred to unambiguously, both internally and externally to the archive (e.g., ISBN, URN).
2) Provenance Information: documents the history of the content information (e.g., its origins, chain of custody, preservation actions and effects) and helps to support claims of authenticity and integrity.
3) Context Information: documents the relationship of the content information to its environment (e.g., why it was created, relationships to other content information).
4) Fixity Information: documents authentication mechanisms used to ensure that the content information has not been altered in an undocumented manner (e.g., checksum, digital signature).
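The four PDI categories above can be pictured as fields of a single record, with Fixity Information computed from the content itself. The following is a minimal sketch rather than an OAIS-defined schema: the class name `PreservationDescription`, the helper functions, the sample identifier, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class PreservationDescription:
    """Hypothetical container for the four PDI categories (not an OAIS schema)."""
    reference: dict    # identifiers, e.g. {"urn": "..."}
    provenance: list   # ordered history of custody and preservation actions
    context: dict      # why the content exists, related content
    fixity: dict = field(default_factory=dict)  # algorithm name and digest


def compute_fixity(content: bytes, algorithm: str = "sha256") -> dict:
    """Return a fixity record (algorithm and hex digest) for the content bytes."""
    digest = hashlib.new(algorithm, content).hexdigest()
    return {"algorithm": algorithm, "digest": digest}


def verify_fixity(content: bytes, fixity: dict) -> bool:
    """Recompute the digest and compare it with the stored one."""
    recomputed = compute_fixity(content, fixity["algorithm"])
    return recomputed["digest"] == fixity["digest"]


content = b"Site report, trench 4"  # stand-in for the archived bitstream
pdi = PreservationDescription(
    reference={"urn": "urn:example:report-0001"},  # made-up identifier
    provenance=["created by field unit", "transferred to archive"],
    context={"purpose": "excavation record"},
    fixity=compute_fixity(content),
)

print(verify_fixity(content, pdi.fixity))         # True: content unchanged
print(verify_fixity(content + b"!", pdi.fixity))  # False: content altered
```

A checksum of this kind only detects change; the Provenance Information is what records whether a change was a documented preservation action.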

Representation Information

Representation information facilitates the proper rendering, understanding, and interpretation of a digital object's content. At the most fundamental level, representation information imparts meaning to an object’s bitstream. For example, it may indicate that a sequence of bits represents text encoded as ASCII characters and, furthermore, that the text is in French. The depth of the representation information needed depends in part on the designated community for whom the content is intended.
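The ASCII and French example can be made concrete with a short sketch. The dictionary keys `encoding` and `language` and the function `interpret` are hypothetical names chosen for illustration; real representation information is usually far richer than this.

```python
# Hypothetical representation information stored alongside one object.
representation_info = {"encoding": "ascii", "language": "fr"}

# The raw bitstream: without representation information it is only numbers.
bitstream = bytes([0x42, 0x6F, 0x6E, 0x6A, 0x6F, 0x75, 0x72])


def interpret(bits: bytes, rep_info: dict) -> dict:
    """Decode the bytes using the declared encoding and attach the language."""
    text = bits.decode(rep_info["encoding"])
    return {"text": text, "language": rep_info["language"]}


result = interpret(bitstream, representation_info)
print(result)  # {'text': 'Bonjour', 'language': 'fr'}
```

The same seven bytes would be meaningless, or misread, if the declared encoding were lost, which is why the record has to travel with the object.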

Preservation-related metadata standards are developing across the digital preservation landscape. Recent developments include the second data dictionary from PREMIS, the NISO Technical Metadata for Digital Still Images, and METS, which is being actively taken up by a number of digital preservation initiatives. Each organization has to navigate the changing metadata scene and consider the standards, practices, protocols, and tools that fit its digital preservation approach and stage of development. As preservation metadata practices stabilize, it will be less necessary for individual organizations to devise interim approaches, but many organizations are finding it necessary to forge ahead with an eye towards community developments and standards.

The technological challenge is to adapt, adopt, and develop appropriate tools and techniques (e.g., JHOVE, PRONOM, Xena) as the organization determines short-term and long-term strategies for addressing evolving preservation metadata requirements. How do community developments on preservation metadata fit into the organization’s digital preservation development plans? What will it take to make new or legacy digital objects ready for long-term preservation?

Preservation metadata costs are beginning to shift from handwork approaches to automated processing and handling. Resources are better spent on developing effective processes and workflows than on manual operations. Open source software and tools are not free of cost either: incorporating them effectively into an organization’s digital preservation program requires resources.