Preservation Service Comparison

Scholars Portal offers three levels of digital storage and preservation: the Ontario Library Research Cloud (OLRC) storage network, DuraCloud, and Permafrost. Each successive option builds on the previous one and adds preservation functionality. The tables below identify the key features of each service, example use cases that best match these features, and a detailed comparison of key functions.

Service Descriptions

Ontario Library Research Cloud (OLRC)
Flexible, available, and reliable base-level storage infrastructure for preservation and access

DuraCloud (backed by the OLRC)
An open-source application designed for bit-level preservation management across storage systems

Permafrost (Archivematica + DuraCloud & OLRC)
A standards-based workflow for preparing and packaging files for long-term preservation and access

Key Features and Example Use Cases

Each description below covers the Ontario Library Research Cloud (OLRC), DuraCloud, and Permafrost, in that order.
Key features
OLRC:
  • Based on OpenStack Swift object storage software
  • All files are replicated 3 times across 5 geographically distributed locations
  • Files are checked internally for integrity. If corruption is found in any of the 3 copies, the flagged file is replaced with a good version
  • Horizon user interface for easy upload, file management, and user management
  • Public access options, including streaming for audiovisual content
  • Secured using the private ORION network, IP restrictions, SSL connections, and data encryption at rest and in transit
DuraCloud (all features of the OLRC, plus):
  • Regular, automated independent integrity checks
  • Option to synchronize independent copies of selected data with separate storage providers (not provisioned by Scholars Portal)
  • Sync Tool for automated ingest and Retrieval Tool for bulk downloads
Permafrost (all features of the OLRC and DuraCloud, plus):
  • Creation of well-formed packages of preservation files and metadata designed for long-term storage
  • Advanced technical metadata extraction and structuring in METS/PREMIS format
  • Ingest and structuring of descriptive and rights metadata
  • Automated file format conversion for preservation and access derivatives
  • Designed around the Open Archival Information System (OAIS) standard
Example use cases
OLRC:
  • Flexible storage for more frequently updated or accessed digital collections and data backups
  • Public access URLs for use in repositories, exhibits, and more
  • Static site hosting
  • Audiovisual content streaming
  • Integrations with S3-speaking applications, such as Archivematica
DuraCloud:
  • Preservation storage for less frequently updated or accessed digital collections
  • Management of master files, especially when access derivatives and metadata are created and managed elsewhere
  • Digital repository contents (including via DSpace integration)
  • Web archiving outputs (including via Archive-It integration)
Permafrost:
  • Preservation storage for less frequently updated or accessed digital collections, especially where preservation management based on technical metadata like file formats is important
  • Born-digital archival records from individuals and organizations
  • Digitized collections, especially where the creation of access derivatives is required
  • Research data (including via Dataverse integration)
  • GIS and map data
  • Integrations and workflows with AtoM
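
The OLRC integrity behaviour listed under key features (three copies, with a corrupted copy replaced by a good version) can be illustrated with a small conceptual sketch. This is not OpenStack Swift's implementation: the function names, the use of MD5 as the recorded checksum, and the in-memory byte strings standing in for stored copies are all assumptions made for illustration.

```python
import hashlib
from typing import List


def md5_hex(data: bytes) -> str:
    """Return the hexadecimal MD5 digest of a byte string."""
    return hashlib.md5(data).hexdigest()


def repair_replicas(replicas: List[bytes], expected_md5: str) -> List[bytes]:
    """Replace every copy whose checksum differs from the recorded one.

    Hypothetical helper: a real storage network works on objects held on
    separate nodes, not on byte strings held in one list.
    """
    # Find one copy that still matches the checksum recorded at upload time.
    good = next((r for r in replicas if md5_hex(r) == expected_md5), None)
    if good is None:
        raise ValueError("no intact copy available to repair from")
    # Keep intact copies and overwrite flagged ones with the good copy.
    return [r if md5_hex(r) == expected_md5 else good for r in replicas]


original = b"preservation master file"
recorded = md5_hex(original)
copies = [original, b"preservation mXster file", original]  # second copy is corrupted
repaired = repair_replicas(copies, recorded)
```

The sketch also shows why replicated copies are not a substitute for independent copies: every replica is judged against the same recorded checksum, so the repair only helps while at least one copy still matches it.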

Detailed Functional Comparison

Each function below is described for the Ontario Library Research Cloud (OLRC), DuraCloud, and Permafrost, in that order.
File organization
OLRC:
  • Projects: a functional grouping of data files, such as “backups” or files belonging to a specific unit in the organization
  • Containers: sub-groupings within a project holding individual files and pseudo folders
DuraCloud:
  • Spaces: groupings of files uploaded to DuraCloud. DuraCloud does not use folder organization, but will preserve paths when folders and their contents are uploaded
Permafrost:
  • AIPs: Archival Information Packages for preservation in OLRC via a space in DuraCloud
  • DIPs: Dissemination Information Packages for access in OLRC via containers in Horizon
Fixity checking
OLRC:
  • At time of ingest: when using the Swift CLI, local checksums are compared with the result from Swift; checksums can also be sent via the Swift API in an ETag header. Horizon reports the checksum received by OpenStack Swift but does not compare it with any locally created checksum value
  • Ongoing: automatic internal integrity checking within the storage network
DuraCloud:
  • At time of ingest: local checksums are calculated when using the Sync Tool or DuraCloud API, but not when content is uploaded using the DuraCloud interface
  • Ongoing: regular health checks automatically verify fixity against the checksum value stored in DuraCloud’s database
Permafrost:
  • At time of ingest: Archivematica generates checksums when processing begins. Checksums provided in packages are also compared with the Archivematica-generated checksums
  • Prior to storage: checksums of original objects in the AIP are compared with the checksums recorded at the time of ingest
  • Ongoing: AIP integrity is verified using DuraCloud health checks; additional fixity verification is also available via the Archivematica Storage Service check fixity API call
User-created metadata ingest options
OLRC:
  • Metadata field:value pairs can be added via the CLI only. Metadata does not have to adhere to a particular standard
DuraCloud:
  • Tags and field:value pairs can be entered using the interface and DuraCloud API, but cannot be used to search for files
Permafrost:
  • Descriptive (Dublin Core and other standards) and rights metadata (PREMIS) can be added via the interface or imported using CSV, JSON, or XML files
Automated metadata extraction
OLRC:
  • MIME type
  • File size, reported in binary units (multiples of 1,024 bytes)
  • Date uploaded
  • Date last modified
  • MD5 checksum
DuraCloud:
  • MIME type
  • File size, reported in decimal units (multiples of 1,000 bytes)
  • Date uploaded or modified
  • MD5 checksum
Permafrost (advanced technical metadata extraction):
  • Signature-based file format identification
  • Technical characterization for image and audiovisual formats
  • File format validation
  • File size, reported in binary units (multiples of 1,024 bytes)
  • File last modified date
  • SHA-256 checksum
File format conversion
OLRC:
  • None
DuraCloud:
  • None
Permafrost:
  • File format policy registry (FPR) containing commands for a variety of tools (including ImageMagick, FFmpeg, and Ghostscript) to generate derivatives for preservation and access
Search and indexing via interfaces
OLRC:
  • File name search in Horizon. Search is not recursive into folders
DuraCloud:
  • File name prefix search
Permafrost:
  • AIP METS records are fully indexed and searchable at the file level in Archivematica
Public access
OLRC:
  • Option to set containers as public access and create links to individual objects
  • Streaming of audiovisual content
DuraCloud:
  • Option to set spaces as public access and share links to individual objects
  • Streaming of audiovisual content
Permafrost:
  • File format conversion for access derivatives
  • DIP deposit to OLRC via Horizon (and AtoM with integration)
API access
Permafrost:
  • Archivematica API
Available integrations
Permafrost:
  • Ingest: Dataverse, DSpace
  • DIP deposit: AtoM, ArchivesSpace
Data deletion protection
OLRC:
  • Read-only user profiles
  • Containers in Horizon can be set as protected to prevent overwriting or deleting contents
DuraCloud:
  • Spaces can be set to read-only
Permafrost:
  • Access to AIPs in DuraCloud is read-only
  • Two-step AIP deletion process
  • DIP and Access containers in Horizon set as protected
Key limitations
OLRC:
  • Because replicated copies across nodes are not independent, all copies are deleted if an admin or member user directs this action. Using DuraCloud enables creating and managing independent copies in additional storage locations
  • Local checksums are only calculated and compared on upload if the CLI is used
DuraCloud:
  • Limited file management capability via DuraCloud interface
  • Local checksums are only calculated and compared on upload if the Sync Tool or API is used
Permafrost (processing capacity per package is limited by service hardware level):
  • Level 1: 125 GB or 3,000 files
  • Level 2: 250 GB or 10,000 files
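
The key limitations above note that local checksums are only compared on upload when the CLI, Sync Tool, or API is used. For content uploaded through a web interface, a similar comparison can be made by hand: compute an MD5 checksum locally and compare it with the value the interface reports. The following is a minimal sketch of that idea, not part of any Scholars Portal tooling. The function names are hypothetical, and it assumes the reported value is a plain MD5 digest of the file contents, which may not hold for very large objects that a storage system splits into segments.

```python
import hashlib


def local_md5(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Compute the MD5 digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"" at end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_reported_checksum(path: str, reported: str) -> bool:
    """Compare a local MD5 with a checksum copied from a storage interface.

    The reported value is normalised first, on the assumption that it may be
    wrapped in quotes or shown in upper case.
    """
    normalised = reported.strip().strip('"').lower()
    return local_md5(path) == normalised
```

A call such as `matches_reported_checksum("master.tif", "<value shown in the interface>")` returns True when the two values agree. A mismatch suggests the file should be uploaded again or checked locally before it is treated as the preservation copy.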