Pre-Ingest – Learn @ Scholars Portal

The final step in processing your born-digital materials is getting them ready for ingest into Archivematica. Here are some things you can do to gather all of the information you will need once you begin the ingest process.

Determining the scope of the transfer package

Before you begin uploading your materials to Archivematica, you will want to know what the final AIP and DIP will look like, what the scope of each transfer package will be, and how many transfers you plan to make. A single fonds can consist of one or more transfer packages, depending on the nature of the fonds, its size, and how you plan to provide access to the materials.

Planning the number and scope of the transfers you create is also important should you wish to take advantage of Archivematica’s automated workflow for uploading DIPs directly to AtoM. If using AtoM for access, each individual transfer should generally contain objects for a single level of description, such as a file or series. Archivematica will take these objects and attach them as items nested underneath the chosen description in AtoM. Alternatively, a DIP can be created by Archivematica, downloaded from storage, and uploaded to AtoM manually. It is also possible to direct a larger set of access copies to AtoM as a DIP and then rearrange them appropriately in AtoM, though this may be labour-intensive depending on the extent of the DIP.

Remember that a transfer’s structure can be rearranged even after the materials have been transferred to Archivematica through the use of the appraisal tab. You can drag and drop digital objects from one or several different backlogged transfers into new directories to create an entirely restructured transfer.

Mapping descriptive metadata to Dublin Core

In preparation for ingest, you will want to begin thinking about how to map your descriptive metadata to the fifteen elements of the simple Dublin Core element set. This metadata can be entered using Archivematica’s template or uploaded as a separate metadata.csv file. Keep in mind that Archivematica’s template doesn’t allow repeated elements, but CSV import does. If you want, for example, to record multiple identifiers, you will need to upload a metadata.csv file. The CSV method also supports importing additional metadata through extended Dublin Core and custom fields.
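
As a rough illustration of repeated elements in a metadata.csv file, here is a minimal Python sketch. It assumes a layout in which the first column is named "filename" and a Dublin Core column header (here "dc.identifier") is simply repeated to carry a second value. The file paths and values are hypothetical, and the exact column conventions should be checked against the Archivematica documentation for your version.

```python
import csv
import io

# Assumed layout: first column "filename", then one column per Dublin Core
# element. Repeating a header (dc.identifier) carries an additional value.
HEADER = ["filename", "dc.title", "dc.creator", "dc.identifier", "dc.identifier"]

# Hypothetical rows; paths are written relative to the transfer root.
ROWS = [
    ["objects/report.pdf", "Annual report", "Example, Author", "ACC-2020-01", "BOX-12"],
    ["objects/photo.tif", "Site photograph", "Example, Author", "ACC-2020-02", ""],
]


def build_metadata_csv(header, rows):
    """Return CSV text for a metadata.csv file, checking row widths first."""
    for row in rows:
        if len(row) != len(header):
            raise ValueError(
                f"row for {row[0]!r} has {len(row)} cells, expected {len(header)}"
            )
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Print the CSV so it can be reviewed before saving it as metadata.csv.
    print(build_metadata_csv(HEADER, ROWS), end="")
```

Using the csv module rather than joining strings by hand means values that contain commas, such as inverted names, are quoted correctly.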

Mapping rights metadata to PREMIS

Archivematica allows for the entry or import of rights metadata conforming to the PREMIS preservation metadata standard. PREMIS enables rights information and access restrictions to be both granular and machine-readable by defining four different kinds of rights “bases” – Copyright, Statute rights, License rights, and Other rights (such as a donor agreement) – which can be combined and applied to multiple or single digital objects at the transfer or item level. You can add PREMIS rights data for an entire transfer using Archivematica’s built-in template, which includes options for entering donor agreement and policy information. You can also include object-level rights metadata by importing a rights.csv file along with the transfer.
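
Before importing object-level rights, it can help to check that every row names a recognised basis. The Python sketch below is one way to do that. The column names "file", "basis", and "note" and the sample rows are assumptions made for illustration, not a confirmed rights.csv layout, and the allowed set contains only the four PREMIS bases named above. If you also use the donor or policy options, pass a larger set.

```python
import csv
import io

# The four PREMIS rights bases named in the text, lowercased for comparison.
PREMIS_BASES = {"copyright", "statute", "license", "other"}

# Hypothetical rows. The column names are assumptions and should be checked
# against the Archivematica documentation before use.
SAMPLE = """file,basis,note
objects/report.pdf,copyright,Copyright held by the donor
objects/photo.tif,licence,Released under an open licence
objects/letter.docx,other,Covered by the donor agreement
"""


def find_unknown_bases(csv_text, allowed=PREMIS_BASES):
    """Return (line_number, file, basis) for rows whose basis is not allowed."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        basis = (row.get("basis") or "").strip().lower()
        if basis not in allowed:
            # reader.line_num is the line of the CSV text just read.
            problems.append((reader.line_num, row.get("file"), basis))
    return problems


if __name__ == "__main__":
    for line_number, file_name, basis in find_unknown_bases(SAMPLE):
        print(f"line {line_number}: {file_name} has unrecognised basis {basis!r}")
```

In the sample, the second data row is flagged because "licence" does not match the spelling "license" used for the PREMIS basis.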

Gathering submission documentation

As part of the structure of the SIP, along with the original digital objects themselves, there is a folder within the metadata directory called “submissionDocumentation” that stores documentation relating to the acquisition and processing of the objects you are transferring. To prepare for ingest, gather all relevant documentation that you wish to include in the final SIP, such as donor agreements, transfer forms, photographs of original media, or accession records.
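
If you assemble transfers on disk before uploading them, the folder described above can be staged with a short script. This Python sketch copies a list of documents into metadata/submissionDocumentation inside a transfer directory. The function name and the example file are hypothetical, and the sketch assumes you are building the transfer folder yourself rather than adding the documents through the Archivematica interface.

```python
import shutil
import tempfile
from pathlib import Path


def stage_submission_docs(transfer_dir, documents):
    """Copy documents into <transfer_dir>/metadata/submissionDocumentation."""
    target = Path(transfer_dir) / "metadata" / "submissionDocumentation"
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for doc in documents:
        doc = Path(doc)
        destination = target / doc.name
        # Refuse to overwrite, so two documents with the same name are noticed.
        if destination.exists():
            raise FileExistsError(f"{destination} already exists")
        shutil.copy2(doc, destination)
        copied.append(destination)
    return copied


if __name__ == "__main__":
    # Demonstration in a throwaway directory with a placeholder document.
    with tempfile.TemporaryDirectory() as tmp:
        agreement = Path(tmp) / "donor_agreement.txt"
        agreement.write_text("placeholder text")
        for path in stage_submission_docs(Path(tmp) / "transfer", [agreement]):
            print(path.relative_to(tmp))
```

The function stops rather than overwriting an existing file, which is a deliberately cautious choice for documentation that may be hard to replace.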

Considering normalization rules

Archivematica contains a registry of many file formats, including normalization rules, and during ingest it will attempt to normalize any format for which an associated rule exists. Your assessment of the file formats of the materials you have processed, as well as your institution’s plans for providing access, will give you an idea of whether any normalization will be needed and what actions Archivematica may perform. Outdated or proprietary formats may require more specialized normalization rules or different tools, or there may not be a tool available for normalization at all. Archivematica allows the default rules to be edited according to your needs, and specialized conversion tools can be used as part of your workflow.
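
A quick count of file extensions can serve as a first pass at the format assessment described above. The Python sketch below is only a rough aid: extensions are a weak hint of the actual format, so it does not replace a proper format identification tool, and the function name and sample files are hypothetical.

```python
import tempfile
from collections import Counter
from pathlib import Path


def count_extensions(transfer_dir):
    """Count files under transfer_dir by lowercased extension."""
    counts = Counter()
    for path in Path(transfer_dir).rglob("*"):
        if path.is_file():
            # Files with no extension are grouped under "(none)".
            counts[path.suffix.lower() or "(none)"] += 1
    return counts


if __name__ == "__main__":
    # Demonstration with a few placeholder files in a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        objects = Path(tmp) / "objects"
        objects.mkdir()
        for name in ("minutes.doc", "scan.TIF", "scan2.tif", "README"):
            (objects / name).write_text("placeholder")
        for extension, total in sorted(count_extensions(tmp).items()):
            print(f"{extension}: {total}")
```

Extensions that appear often, or that you do not recognise, are good candidates to check against the normalization rules before you start the transfer.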

There is also the option to use manually normalized files, if you have already created finished preservation and access copies, or if the tools you use cannot be incorporated into Archivematica’s workflow.

Resources

A Primer on PREMIS and PREMIS Rights

A short blog post from the Bentley Historical Library explaining the PREMIS data model, how PREMIS handles rights information, and how it is being used at the Bentley.

Archivematica Ingest Documentation

Archivematica’s official documentation with step-by-step explanations of the ingest process, including adding metadata and rights data, normalization processes, and arrangement using the appraisal tab.

DPC Technology Watch Report: Preservation Metadata (2nd edition)

A detailed DPC report introducing the PREMIS Data Dictionary and how PREMIS can be used as a tool for digital preservation.

Implementing Rights Metadata for Digital Preservation (Paywalled)

This chapter is another overview of PREMIS, but specifically examines how it interacts with Archivematica.

PREMIS Data Dictionary for Preservation Metadata – Version 3.0

The latest version of the PREMIS data dictionary itself.