Preservation Service Comparison

This chart compares three levels of digital preservation service offered by Scholars Portal: the OLRC storage network by itself, DuraCloud, and Permafrost. Each option adds preservation functionality to the one before it. The sections below identify the key features of each service, example use cases that best match these features, and a detailed comparison of key functions.

Service descriptions

Ontario Library Research Cloud (OLRC)
Flexible, available, and reliable base-level storage infrastructure for preservation and access

DuraCloud (backed by the OLRC)
An open-source application designed for bit-level preservation management across storage systems

Permafrost (Archivematica + DuraCloud & OLRC)
A standards-based workflow for preparing and packaging files for long-term preservation and access

Key features and example use cases

Description Ontario Library Research Cloud (OLRC) DuraCloud Permafrost
Key features
OLRC:
  • Based on OpenStack Swift object storage software
  • All files are replicated 3 times across 5 geographically distributed locations
  • Files are checked internally for integrity; if a replicated copy is found to be
    corrupted, it is replaced with a good version
  • Horizon user interface for easy upload, file management, and user management
  • Public access options, including streaming audiovisual content
  • Secured using the private ORION network, IP restrictions, SSL connections, and
    encryption of data at rest
DuraCloud: All features of the OLRC, plus:
  • Regular, automated independent integrity checks
  • Option to synchronize additional copies with separate storage providers
  • Sync Tool for automated ingest
Permafrost: All features of the OLRC and DuraCloud, plus:
  • Creates well-formed packages of preservation files and metadata designed for long-term
    storage
  • Design based on the Open Archival Information System (OAIS) standard
  • Advanced technical metadata extraction and structuring in METS/PREMIS format
  • Ingest and structuring of descriptive and rights metadata
  • Automated file format conversion for preservation and access derivatives
Example use cases
OLRC:
  • Flexible storage for more frequently updated or accessed digital collections and system
    backups
  • Public access URLs for use in repositories, exhibits, and more
  • Static website hosting
  • Audiovisual content streaming
  • Integrations with S3-speaking applications
DuraCloud:
  • Preservation storage for less frequently updated or accessed digital collections
  • Digitized collections, especially when access derivatives and metadata are created and
    managed elsewhere
  • Digital repository contents (including via DSpace integration)
  • Web archiving outputs (including via Archive-It integration)
Permafrost:
  • Preservation storage for less frequently updated or accessed digital collections, especially
    where preservation management based on technical metadata like file formats is important
  • Born-digital archival records from individuals and organizations
  • Digitized collections, especially where the creation of access derivatives is required
  • Research data
  • GIS and map data
  • Integrations and workflows with AtoM

Detailed functional comparison

Function Ontario Library Research Cloud (OLRC) DuraCloud Permafrost
File organization
OLRC:
  • Projects: A project is usually organized around a functional grouping of data files, such as “backups” or files belonging to a specific unit in the organization. Users are assigned at the project level.
  • Containers: Sub-groupings within a project; containers hold individual files and folders
DuraCloud:
  • Spaces: Hold groupings of files uploaded to DuraCloud. DuraCloud does not use folder organization beyond this, but will preserve paths when folders and their contents are uploaded
Permafrost:
  • AIPs: Archival Information Packages for preservation in OLRC via a space in DuraCloud
  • DIPs: Dissemination Information Packages for access in OLRC via a container in Horizon
Fixity checking
OLRC:
  • Incoming: The Swift CLI compares local checksums with the value returned by Swift; with the Swift API, a checksum can be sent in an ETag header. Horizon reports the checksum received by OpenStack Swift but does not compare it with any locally created checksum value
  • Ongoing: Internal integrity checking within the storage network
DuraCloud:
  • Incoming: Local checksums are calculated when using the Sync Tool or DuraCloud API, but not when using the DuraCloud interface
  • Ongoing: Health checks run at least twice a year and automatically verify fixity against the checksum value stored in DuraCloud’s database
Permafrost:
  • Incoming: Ingest of packages in BagIt format or files with checksums
  • Ongoing: AIP integrity is verified using DuraCloud health checks; additional fixity verification is also available via the Archivematica Storage Service check fixity API call
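
The incoming check described for the OLRC above, where a locally calculated checksum is compared with the value the storage service reports, can be sketched in a few lines. This is a minimal illustration, not the Swift CLI's own code. It assumes the reported ETag is the MD5 digest of the object's contents, which is the usual case for objects uploaded in one piece but not for large segmented objects. The function names are hypothetical.

```python
import hashlib


def local_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_etag(path, reported_etag):
    """Compare a local MD5 with an ETag value reported by the storage service.

    Assumption: the ETag is the MD5 of the object contents.
    """
    # Some clients return the ETag wrapped in double quotes; strip them first.
    return local_md5(path) == reported_etag.strip('"').lower()
```

A mismatch would suggest that the file changed or was damaged between the local copy and the stored object, and that the upload should be repeated.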
User-created metadata ingest options
OLRC:
  • Metadata field:value pairs can be added via the CLI only
DuraCloud:
  • Tags and field:value pairs can be entered using the interface and DuraCloud API but are not searchable using the interface
Permafrost:
  • Detailed descriptive metadata (Dublin Core and custom formats) and rights metadata (PREMIS format) can be added via the interface and using CSV files
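
For the OLRC option, adding field:value pairs through the CLI might look like the sketch below. It only builds the argument list for the python-swiftclient `swift post` command with its `--meta` option and does not run it. The container, object, and field names are hypothetical, and authentication options are left out.

```python
def swift_post_command(container, obj, pairs):
    """Build the argument list for a `swift post` call that sets object metadata."""
    command = ["swift", "post", container, obj]
    for field, value in pairs.items():
        # Each pair is passed as its own --meta field:value option.
        command += ["--meta", f"{field}:{value}"]
    return command


# Hypothetical example values, for illustration only.
command = swift_post_command("backups", "report.pdf", {"Collection": "maps", "Year": 2020})
print(" ".join(command))
```

The resulting list could be passed to `subprocess.run` once the usual Swift credentials are available in the environment.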
Automated metadata extraction
OLRC:
  • MIME type
  • File size
  • Date uploaded or modified
DuraCloud:
  • MIME type
  • File size
  • Date uploaded or modified
Permafrost (advanced technical metadata extraction):
  • Signature-based file format identification
  • Technical characterization for image and audiovisual formats
  • File format validation
  • File size
  • File last modification date
File format conversion
OLRC:
  • None
DuraCloud:
  • None
Permafrost:
  • File format policy registry (FPR) containing a variety of tools (including ImageMagick, FFmpeg, and Ghostscript) for generating derivatives for preservation and access
Search and indexing via interfaces
  • Horizon: File name search that does not recurse into folders
  • DuraCloud: File name prefix search
  • Archivematica: AIP METS files fully indexed and searchable
Public access
OLRC:
  • Can set public access containers and create links to objects
  • Streaming audiovisual content
DuraCloud:
  • Can set public access containers and create links to objects
Permafrost:
  • File format conversion for access derivatives
  • DIP deposit to OLRC via Horizon with same features available
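
As an illustration of how a link to an object in a public container is put together, the sketch below joins a storage URL, a container name, and an object name in the way Swift addresses objects. The storage URL and names are hypothetical placeholders, and the sketch assumes the container has already been set to public access.

```python
from urllib.parse import quote


def public_object_url(storage_url, container, object_name):
    """Compose the URL of an object in a publicly readable container."""
    return "/".join([
        storage_url.rstrip("/"),
        quote(container, safe=""),
        # Keep "/" unescaped so folder-like prefixes in the object name survive.
        quote(object_name, safe="/"),
    ])


# Hypothetical storage URL and names, for illustration only.
print(public_object_url("https://swift.example.org/v1/AUTH_demo", "exhibits", "maps/old town.tif"))
```

Percent-encoding the names keeps the link valid when file names contain spaces or other reserved characters.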
API access
OLRC:
  • Swift API
  • S3 API
DuraCloud:
  • DuraCloud API
  • Read-only access with Swift/S3 APIs
Permafrost:
  • Archivematica Storage Service API
  • Dashboard API
Available integrations
  • Archive-It
  • DSpace
  • Incoming: Dataverse, DSpace
  • DIP deposit: AtoM, ArchivesSpace
Data deletion protection
OLRC:
  • Read-only user profiles
  • Containers in Horizon can be set as ‘protected’ to prevent overwriting or deleting contents
DuraCloud:
  • Read-only user profiles
Permafrost:
  • Access to AIPs in DuraCloud is read-only
  • Two-step AIP deletion process
  • Containers in Horizon can be set as ‘protected’
Key limitations
OLRC:
  • Because replicated copies across nodes are not independent, all copies are deleted if an admin or member user directs this action. Using DuraCloud enables creating and managing independent copies in additional storage locations
  • Unless the command-line interface is used, local checksums are not calculated and compared on upload
DuraCloud:
  • Limited file management capability via the DuraCloud interface
  • Unless the Sync Tool or API is used, local checksums are not calculated and compared on upload
Permafrost: Processing capacity per package is limited by the service hardware level:
  • Level 1: 50GB or 3,000 files
  • Level 2: 250GB or 10,000 files
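
The processing limits above can be checked before a package is submitted. The sketch below is illustrative only. It assumes that "or" means a package must stay within both the size limit and the file count limit of a level, which is an interpretation and not something the chart states. The names `LEVELS` and `smallest_level` are hypothetical.

```python
# Limits per package as listed above: (maximum size in GB, maximum number of files).
LEVELS = {
    "Level 1": (50, 3_000),
    "Level 2": (250, 10_000),
}


def smallest_level(size_gb, file_count):
    """Return the first level whose limits both accommodate the package, else None.

    Assumption: a package must fit within both the size and the file count limit.
    """
    for name, (max_gb, max_files) in LEVELS.items():
        if size_gb <= max_gb and file_count <= max_files:
            return name
    return None
```

A package that returns `None` would need to be split into smaller packages before processing.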