Arrangement and description is the stage where the materials are brought under intellectual and physical control so that they can be accessed in the future. This involves determining appropriate levels of description, establishing intellectual arrangement, and creating finding aids. When describing digital fonds these steps are largely the same. However, instead of going through physical folders to examine connections and context, you will be using the information you have discovered through the identification of file formats, extracted metadata, and associated software. Your descriptions and arrangements will need to be built around the unique properties of born-digital materials, and provide future archivists and researchers with the tools and information they will need to access and understand them. Arrangement and description can take place prior to or following preservation processing with Archivematica. However, because Archivematica workflows also include inserting structured rights and descriptive metadata, performing description beforehand makes organizing this information easier. Similarly, access considerations (discussed below) should take into account the physical arrangement of files into SIPs, AIPs and DIPs.
Conducting Relevant Contextual Research/Documentation
Archival description draws not only on the content of the material being described but also on the context of its creation. When working with digital materials, researching that context may require different techniques than you would use with physical items. For example, your description will need to note the extent of the materials, expressed in megabytes, gigabytes, or terabytes. You may also want to note how many individual files make up the fonds. This can be broken down further depending on the formats of the files you are working with and how they were created. If you are working with a collection of web pages harvested through an API, you should research how that harvest was performed so that your description is more accurate. If you have files that were created using a specific kind of legacy software, that context will likewise inform your description. Digitized files also carry important contextual information about who digitized them, when they were digitized, and what software was used, and this should be included alongside descriptions of their content.
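The extent figures described above can be gathered with a short script. The following is a minimal sketch in Python, not a prescribed tool: the function names are hypothetical, the file extension is used only as a rough stand-in for format (a dedicated format identification tool is more reliable), and sizes are reported in decimal units.

```python
from collections import Counter
from pathlib import Path


def summarize_extent(root):
    """Return (file_count, total_bytes, Counter of lowercased extensions)."""
    file_count = 0
    total_bytes = 0
    by_extension = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            file_count += 1
            total_bytes += path.stat().st_size
            # Files without an extension are grouped under "(none)".
            by_extension[path.suffix.lower() or "(none)"] += 1
    return file_count, total_bytes, by_extension


def human_size(num_bytes):
    """Format a byte count using decimal units (1 MB = 1,000,000 bytes)."""
    size = float(num_bytes)
    for unit in ("bytes", "KB", "MB", "GB"):
        if size < 1000:
            return f"{size:.1f} {unit}"
        size /= 1000
    return f"{size:.1f} TB"
```

Calling summarize_extent on the top-level folder of a transfer returns the three values needed for an extent statement, and human_size turns the byte total into a readable figure. Whichever unit convention you choose (decimal or binary), record it in your processing notes so the figure can be interpreted later.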
Determining Intellectual and Physical Arrangement
After conducting contextual research, you will need to decide how to approach the intellectual arrangement of the materials. The content, context, and scope of the entire collection should influence the final intellectual arrangement. A difficulty you will likely encounter is how to arrange a hybrid fonds. Some archivists choose to keep the fonds integrated, with series containing both physical and digital materials. Other archivists create separate series for different formats. These decisions should be informed by the context and nature of the collection, and will vary with the functions of the materials being arranged.
It might also be necessary to perform some physical arrangement to support the context of the materials and clarify relationships between items, for example physically arranging files into series. This should be guided by your descriptions and the material’s original arrangement. Before beginning any kind of physical arrangement, check again to see if you have created a directory listing of the original file order. This listing will help you document the changes you have made, and can then be included alongside the other metadata as part of the final SIP. Should you need to re-arrange materials after transferring them to Archivematica, and prior to ingest, there is an arrangement panel that allows you to structure and create SIPs for ingest into the system (additional explanations, as well as screencast demonstrations, can be found in this section’s Resources).
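If you find that no directory listing of the original order exists yet, one can be produced before any files are moved. The sketch below is one possible approach, assuming a CSV listing is acceptable to your institution; the function name and column names are hypothetical, and the listing file should be written outside the folder being listed so that it does not record itself.

```python
import csv
import os
from datetime import datetime, timezone


def write_directory_listing(root, listing_path):
    """Write one CSV row per file under root and return the number of rows."""
    rows = 0
    with open(listing_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["relative_path", "size_bytes", "last_modified_utc"])
        for folder, subfolders, filenames in os.walk(root):
            subfolders.sort()  # sort in place so the walk order is reproducible
            for name in sorted(filenames):
                full_path = os.path.join(folder, name)
                info = os.stat(full_path)
                modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
                writer.writerow([
                    os.path.relpath(full_path, root),
                    info.st_size,
                    modified.isoformat(),
                ])
                rows += 1
    return rows
```

Running the same function again after re-arrangement gives a second listing that can be compared with the first to document what was moved.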
Creating Appropriate Levels of Description
Your descriptions are there to guide those who wish to use the materials at a later date by giving them important contextual information. The amount of detail you use in your descriptions should be tailored to the collection you are working with, as well as the systems you are using for access. Some collections contain too large a volume of materials to realistically describe at the item level, and you would therefore describe at the folder level or above. On the other hand, the method of access may require some form of description for each item made available, so that the finding aid can link to them. One example of this is how Archivematica links to AtoM, an open-source archival description and access system. Archivematica can upload a DIP to AtoM, which automatically uses any simple Dublin Core metadata that the archivist entered in Archivematica during ingest to create item-level descriptions for the access copies.
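Where item-level description is needed, the simple Dublin Core values can be prepared in a spreadsheet-style file before transfer. The sketch below builds such a file in Python. The "filename", "dc.title", and "dc.date" column headers and the "objects/" path prefix reflect a common convention for Archivematica metadata import, but they are assumptions here and should be checked against the documentation for your Archivematica version; the function name and the sample item are hypothetical.

```python
import csv
import io


def build_metadata_csv(items):
    """Return CSV text with one row of simple Dublin Core values per item.

    Each item is a dict with hypothetical keys 'path', 'title', and 'date'.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Assumed header convention; confirm against your system's documentation.
    writer.writerow(["filename", "dc.title", "dc.date"])
    for item in items:
        writer.writerow(["objects/" + item["path"], item["title"], item["date"]])
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [{"path": "minutes/20190307_board.pdf",
               "title": "Board minutes",
               "date": "2019-03-07"}]
    print(build_metadata_csv(sample))
```

Using the csv module rather than joining strings by hand ensures that titles containing commas or quotation marks are escaped correctly.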
Another thing to consider is whether you will want to include the finding aid alongside the other metadata in the transfer package to be ingested into Archivematica, or whether it will be hosted somewhere else. Including it has the benefit of keeping the method of discovery alongside the physical files. However, this also means that the finding aid would need to be updated in two different places, creating the possibility of differences in content.
A final problem facing archivists who describe born-digital collections is that common descriptive standards such as RAD and DACS do not easily lend themselves to describing digital materials, and must be manually adjusted to accommodate digital objects (a more detailed explanation of the issues with RAD can be found in the Resources section). Recently, a group of University of California archivists including Annalise Berdini, Charles Macquarie, Shira Peltzman, and Kate Tasker created the ‘UC Guidelines for Born-Digital Archival Description’ to address this gap. The guidelines contain a set of elements appropriate for creating finding aids for collections containing born-digital items. These cover describing the extent of digital materials, what to include in a container list, and how to document processing steps, all of which have aspects that are unique to working with born-digital items. Yale University has also released a similar set of guidelines.
Redacting Personally Identifiable Information
Based on your appraisal processes, you will now know which materials contain personal or sensitive information that you will need to redact before the materials are made accessible to researchers. The location and nature of this information should be documented and included along with other relevant metadata so that it will be easy to redact from access copies in the future. Depending on how much time you have to process the files, your plans for access, and the anticipated demand, you may also decide to redact files only when they are requested.
The previously mentioned UC guidelines recommend that the description include the reasons for removing PII, such as health or financial information, along with the institutional policies and procedures that prompted its removal.
Deleting Duplicates/Hidden Files
As discussed in the review and appraisal section, you may have identified duplicate or hidden files that you wish to delete. Hidden files are a particular problem when working with disk images, because creating a disk image produces an exact copy of the disk, including files the user may have thought were deleted. If you have identified these files and decided that they are not worth keeping, this is when you would delete them once and for all, and document your actions and reasoning.
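Before deleting anything, it can be worth confirming that suspected duplicates really are byte-for-byte identical. One common way to do this is to compare checksums. The sketch below is an illustration under that assumption rather than a required method; the function names are hypothetical, and it only reports groups of identical files, leaving the decision about which copy to keep to the archivist.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute a SHA-256 checksum, reading in chunks so large files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Map each checksum shared by two or more files to the list of those files."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[sha256_of(path)].append(path)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}
```

The returned groups can be saved alongside your other documentation as a record of what was identified as a duplicate and why it was removed.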
Once you have deleted duplicates as needed, you will also want to weed the remaining materials in accordance with the information gathered during appraisal, the decisions made during accessioning, and your institution’s guidelines. As with deleting duplicates and hidden files, don’t forget to document this process and include the record along with the other documentation you have created.
In the process of conducting arrangement you may decide that you need to rename files. Most best practice guides for digital file naming will first instruct you to remove any special characters from file names, as they cannot be read on all systems. Note that Archivematica performs this step automatically. Spaces should also either be removed or replaced with an underscore or dash. Filenames should ideally follow a short, consistent, and comprehensible formula. When including dates, it is a good idea to format them as YYYYMMDD so that files sort chronologically. For more guidelines on file naming, see Resources.
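The naming conventions above can be sketched as a small Python helper. This is an illustration only: the specific rules (underscores for spaces, ASCII letters, digits, underscores and dashes only, lowercase extension) are assumptions chosen for the example, the function names are hypothetical, and the result is not meant to reproduce Archivematica's own filename handling. The functions only suggest names; they do not rename anything on disk.

```python
import os
import re
import unicodedata
from datetime import date


def suggest_filename(original):
    """Suggest a cleaned-up version of a single file name (not a full path)."""
    stem, extension = os.path.splitext(original)
    # Decompose accented letters, then drop anything that is not plain ASCII.
    stem = unicodedata.normalize("NFKD", stem).encode("ascii", "ignore").decode("ascii")
    # Replace runs of whitespace with a single underscore.
    stem = re.sub(r"\s+", "_", stem.strip())
    # Remove remaining special characters.
    stem = re.sub(r"[^A-Za-z0-9_-]", "", stem)
    return (stem or "unnamed") + extension.lower()


def dated_filename(day, original):
    """Prefix the cleaned name with the date in YYYYMMDD form."""
    return day.strftime("%Y%m%d") + "_" + suggest_filename(original)
```

For example, suggest_filename("Board Minutes (final)!.DOCX") returns "Board_Minutes_final.docx". Keeping a record that pairs each original name with its new name preserves the link back to the original order.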
This blog post also addresses the new UC guidelines, this time by summarizing a workshop focusing on born-digital description which included practical demonstrations of how the guidelines could be used to describe both digital and hybrid fonds.
This blog post introduces a new set of born-digital description guidelines created through the collaboration of five California universities. These guidelines address the gaps in the existing standards, and simplify the work of describing born-digital materials. A link to the guidelines themselves can be found both within the article and in the Resources section below.
This case study explains how the University of Oregon integrated the freeware program “ReNamer” into their finding aid production workflow in order to automate the renaming of large numbers of files.
The archival arrangement tab in Archivematica can be used for the physical arrangement of the final SIP prior to ingest. See the resources below for more information.
This program allows for the bulk renaming of files, including the removal of special characters. Comes in both free and premium versions. Windows only.
Screencasts created by the Bentley Historical Library that demonstrate how to use the appraisal and arrangement tab in Archivematica.
A short overview of file naming best practices from Stanford University Libraries. Also contains links to case studies related to file naming.
An in-depth look at the Bentley Historical Library’s integration of ArchivesSpace, Archivematica, and DSpace into a single workflow, and their creation of the Archivematica Appraisal Tab.
A longer and more in-depth set of suggestions for file naming, containing examples drawn from the University of Iowa.
Another explanation of the unique considerations associated with arrangement and description of born-digital objects, and how they differ from traditional archival practice.
An overview of the literature surrounding the description and cataloguing of born-digital materials in archives. Examines multiple case studies of how different archives have sought to deal with the problems of describing and giving access to born-digital personal papers.
A 2013 conference paper by Kat Timms that explains in detail the problems with trying to describe born-digital records using the Rules for Archival Description.
Guidelines for born-digital archival description that adapt descriptive elements from existing standards such as DACS and ISAD(G) to be more useful when describing digital objects, as well as addressing gaps in these elements.
A similar set of guidelines for born-digital archival records to the UC guidelines linked above.