Dataset sharing as a service (DASH) provides a simple interface for the EPFL scientific community to share research datasets associated with scientific publications.
Researchers are increasingly required to share large datasets to ensure reproducibility of their work. These datasets must remain accessible for extended periods since they are directly linked to one or more scientific publications.
The service uses the S3 protocol with S3 backends, which are specifically designed for long-term storage. Because S3 is an HTTP REST API by design, datasets can be shared in read-only mode through a simple public URL.
This dataset sharing service was originally established by the EPFL School of Computer and Communication Sciences (IC) IT team for professors within that school. Due to growing demand across all EPFL schools, the Research Computing Platform (RCP) now offers this service to the entire institution. This expansion was made possible by the valuable contributions of successive IC IT teams, and by the feedback and operational support of the Chief Information Security Officer and the operational security teams of the Vice Presidency for Operations, who ensured that the existing service underwent a comprehensive security audit.
You should use this service when you need to:
- Share large datasets linked to your scientific publications
- Ensure long-term accessibility of your research data for reproducibility purposes
- Provide public read-only access to datasets via a stable URL
- Distribute data to a designated audience without managing complex access controls
Datasets stored on the EPFL S3 Long Term Storage Service are shared through a dedicated proxy. This proxy filters requests based on the URI and redirects them to specific S3 buckets using a read-only access key and secret key.
RCP will provide you with a direct link to your dataset in read-only mode through your chosen URL.
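The filtering step described above can be sketched as follows. This is a minimal illustration, not the actual proxy: the routing table `DATASET_BUCKETS`, the bucket name, and the `route_request` helper are hypothetical, and the read-only access key and secret key that the real proxy uses to reach the bucket are deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import unquote, urlsplit

# Hypothetical routing table: public dataset name -> backing S3 bucket.
# The real proxy configuration is maintained by RCP and is not shown here.
DATASET_BUCKETS = {
    "my-dataset": "unit-my-dataset-bucket",
}

# Only methods that cannot modify data are let through.
READ_ONLY_METHODS = {"GET", "HEAD"}


@dataclass(frozen=True)
class Route:
    bucket: str  # S3 bucket that holds the dataset
    key: str     # object key inside that bucket ("" for the dataset root)


def route_request(method: str, uri: str) -> Optional[Route]:
    """Return the bucket and object key for a request, or None if rejected."""
    if method.upper() not in READ_ONLY_METHODS:
        return None  # writes and deletes never reach the bucket
    path = unquote(urlsplit(uri).path)
    segments = [s for s in path.split("/") if s]
    if not segments or ".." in segments:
        return None  # empty path or path traversal attempt
    bucket = DATASET_BUCKETS.get(segments[0])
    if bucket is None:
        return None  # unknown dataset name
    return Route(bucket=bucket, key="/".join(segments[1:]))
```

In this sketch the first path segment selects the dataset and the rest of the path becomes the object key, so a request for `/my-dataset/train/data.csv` would be served from the hypothetical bucket `unit-my-dataset-bucket`, while any non-read method is rejected before it reaches storage.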
Before requesting access, please ensure the following:
- S3 Bucket Creation: Your unit must create an S3 bucket in advance through the XaaS portal at https://portal-xaas.epfl.ch. Your proximity IT support team can assist you with this step if needed.
- Data Upload: The dataset must be uploaded to the bucket by the data owner.
- Request Read-Only Link: Once the bucket is ready and populated, request the read-only link from RCP, who will then configure the read-only proxy link.
Once configured, your dataset will be accessible at:
https://datasets.epfl.ch/<NAME OF YOUR DATASET>
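Because access goes over plain HTTPS, readers of a publication need no S3 client or credentials to fetch the data. The sketch below uses only the Python standard library. It assumes that individual objects are addressable by appending their key to the dataset URL, which should be checked against the link RCP provides. The dataset name, object key, and helper names are hypothetical.

```python
import shutil
import urllib.request
from urllib.parse import quote

BASE_URL = "https://datasets.epfl.ch"


def dataset_url(dataset_name: str, object_key: str = "") -> str:
    """Build the public URL of a dataset, or of one object inside it."""
    url = f"{BASE_URL}/{quote(dataset_name, safe='')}"
    if object_key:
        # Assumption: objects are reachable as <dataset URL>/<object key>.
        url += "/" + quote(object_key, safe="/")
    return url


def download(url: str, destination: str) -> None:
    """Stream a publicly readable object to a local file."""
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out:
        shutil.copyfileobj(response, out)
```

A reader could then call `download(dataset_url("my-dataset", "raw/run 1.csv"), "run_1.csv")`. The URL is percent-encoded so that keys containing spaces or other special characters remain valid.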
- Provides a robust, secure, and easy way for the EPFL scientific community to share datasets when necessary
- Built on enterprise-grade S3 long-term storage infrastructure
- Simple public URL access with read-only restrictions
- Designed specifically for academic and research publication requirements
The dataset sharing service itself is free of charge.
However, the underlying S3 bucket storage is billed according to the u1 pricing model. Please refer to the official S3 service description for detailed pricing information.