An ACM project funded by the ETH Board’s “Open Research Data” Programme
Open Research Data Portal
The CA–O–RD project (Contemporary Architecture – Open Research Data), supported by the ETH Board and led by the Archives de la construction moderne (EPFL), aims to establish a digital preservation infrastructure for architectural archives, which are increasingly born-digital. In collaboration with ENAC-IT4R, it relies on Archivematica, which complies with the OAIS model and is integrated with Morphé (AtoM), to ensure the ingestion, standardization, preservation, and dissemination of data.
Objectives
The project addresses the ACM’s need to preserve digital archives sustainably and keep them accessible, in line with Open Research Data principles.
Main objectives:
- Ensure long-term preservation of born-digital and digitized archives
- Provide open access to disseminable content via Morphé (AtoM)
- Systematically publish metadata, even when files are not accessible
Scope:
- Approximately 1 TB of data (more than 300,000 files)
- Several representative collections across diverse media
Source Data
The project is based on a selected corpus of digital archives.
Corpus characteristics:
- Approximately 1 TB of data (more than 300,000 files)
- Wide variety of formats: CAD (.dwg, .dxf), images, PDFs, office files, compressed formats
- Diverse storage media: servers, external drives, CD-R/DVD
Specific challenges:
- Complexity and heterogeneity of files
- Preservation challenges for CAD formats, often proprietary and difficult to migrate
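A corpus this heterogeneous is easier to assess once its formats have been counted. The following is a minimal sketch, not the project's actual tooling: the function names and the `CAD_EXTENSIONS` set are illustrative assumptions, and the folder path is whatever copy of the corpus is being analysed.

```python
from collections import Counter
from pathlib import Path

# Assumption: only the two CAD extensions named in the corpus description.
CAD_EXTENSIONS = {".dwg", ".dxf"}


def inventory_extensions(root: str) -> Counter:
    """Count files under root by lowercase extension ('' if there is none)."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            counts[path.suffix.lower()] += 1
    return counts


def cad_file_count(counts: Counter) -> int:
    """Number of files whose extension marks them as CAD drawings."""
    return sum(counts[ext] for ext in CAD_EXTENSIONS)
```

Calling `inventory_extensions` on a folder and then `counts.most_common()` gives a quick overview of which formats dominate and how many CAD files will need special attention.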
Methodology
The project relies on an ingestion and dissemination strategy that balances Open Research Data (ORD) principles with legal constraints.
Dissemination criteria:
- Absence of sensitive data
- Clear attribution to the producer
- No third-party restrictions
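The three criteria above amount to a simple decision rule: files are disseminated only when all of them hold, and otherwise only the metadata is published. A minimal sketch of that rule follows; the class and field names are hypothetical and do not come from Archivematica or AtoM.

```python
from dataclasses import dataclass
from enum import Enum


class Dissemination(Enum):
    FULL = "files and metadata"
    METADATA_ONLY = "metadata only"


@dataclass
class FolderAssessment:
    # Hypothetical field names mirroring the three dissemination criteria.
    has_sensitive_data: bool
    producer_clearly_attributed: bool
    has_third_party_restrictions: bool


def decide_dissemination(a: FolderAssessment) -> Dissemination:
    """Return FULL only when all three criteria are met, else METADATA_ONLY."""
    if (not a.has_sensitive_data
            and a.producer_clearly_attributed
            and not a.has_third_party_restrictions):
        return Dissemination.FULL
    return Dissemination.METADATA_ONLY
```

In this sketch the default outcome is metadata-only publication, which reflects the objective of publishing metadata even when the files themselves cannot be made accessible.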
Workflow:
- Selection and analysis of folders
- Preparation of metadata and structure
- Ingestion into Archivematica
- Publication via AtoM (or metadata only)
IT Tools
Archivematica: An open-source digital preservation system based on international standards, notably ISO 14721:2012 (OAIS), ensuring long-term access to digital archives.
How it works:
- Ingestion and validation of SIPs (Submission Information Packages) submitted by producers
- Generation of AIPs (Archival Information Packages) through conversion, normalization, and enrichment processes
- Creation of DIPs (Dissemination Information Packages) for dissemination via the AtoM platform
FAIR Data
The ACM seeks to strengthen its preservation capacity and ensure access to data in accordance with FAIR principles (Findable, Accessible, Interoperable, Reusable):
- Findable: each record is assigned a stable identifier and a persistent link
- Accessible: descriptions are available online without access barriers
- Interoperable: metadata follows archival standards (ISAD(G) etc.) and can be exported or cross-referenced
- Reusable: data is released under an open license, accompanied by context and processing history
Data for Research
These previously unpublished archives open new research perspectives. Their management requires specific actions to ensure access and document authenticity.
Principles:
- Preservation of bitstreams to ensure integrity and reuse
- Migration to sustainable formats when possible
- Extraction of technical and contextual metadata
- Production of dissemination copies in suitable formats
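Bitstream integrity is usually verified by recording a checksum for each file and recomputing it later. The sketch below illustrates the idea with SHA-256 from the Python standard library. It is an assumption-laden example, not a description of how Archivematica records or checks fixity, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files are never loaded whole into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root: Path, manifest: dict[str, str]) -> list[str]:
    """List manifest entries that are now missing or whose checksum differs."""
    changed = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file() or sha256_of(p) != expected:
            changed.append(rel)
    return changed
```

A manifest built at selection time can be compared against the same folder just before ingestion: an empty list from `changed_files` indicates that no recorded file was altered or lost in between.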
