RDM Q&A

How can I make my research data publicly available as academic output?

You can do it in many ways (see here), but to align with most funders’ policies and to make your data discoverable and citable by others, EPFL recommends using Zenodo. This is an open-access repository for publishing research data and code as scholarly products.

How do I submit my dataset to Zenodo?

Submitting a dataset is easy: log in to your Zenodo account; start a new upload and select the EPFL community or directly go to the EPFL community upload; upload your files; fill in the required fields; and choose an appropriate license.
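If you prefer to prepare your submission in a script, the "fill in the required fields" step can be sketched as below. This is a minimal sketch, not an official EPFL or Zenodo script: it only assembles a metadata record and sends nothing. The field names follow Zenodo's deposit metadata as we understand it; the helper name `build_zenodo_metadata`, the community identifier `epfl`, the license identifier, and all example values are assumptions to check against the current Zenodo documentation.

```python
import json


def build_zenodo_metadata(title, description, creators,
                          license_id="cc-by-4.0", community="epfl"):
    """Assemble a metadata record for a dataset deposit.

    `license_id` and `community` are illustrative assumptions; check the
    identifiers shown on Zenodo before using them.
    """
    if not title or not description:
        raise ValueError("title and description are required")
    if not creators:
        raise ValueError("at least one creator is required")
    return {
        "metadata": {
            "upload_type": "dataset",
            "title": title,
            "description": description,
            # Each creator is a (name, affiliation) pair.
            "creators": [
                {"name": name, "affiliation": affiliation}
                for name, affiliation in creators
            ],
            "license": license_id,
            "communities": [{"identifier": community}],
        }
    }


# Hypothetical example values.
record = build_zenodo_metadata(
    title="Example dataset (hypothetical)",
    description="Measurements supporting an example publication.",
    creators=[("Doe, Jane", "EPFL")],
)
print(json.dumps(record, indent=2))
```

Keeping the metadata in a small function like this makes it easy to review the record with your co-authors before anything is uploaded.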

How can I ensure my dataset meets good publication practices?

By following EPFL’s guidelines, you will meet good publication practices. You will need to provide context by including a README file (here is a README template), detailed metadata about the dataset, and an appropriate license.
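As an illustration, a README skeleton can be generated so that every dataset starts from the same structure. The sketch below is not the EPFL README template: the section headings and the helper name `render_readme` are hypothetical and should be replaced by the headings of the template you actually use.

```python
# Hypothetical section headings; adapt them to the README template you use.
README_SECTIONS = [
    "General information",
    "Data and file overview",
    "Methodology",
    "Licensing and access",
]


def render_readme(title, authors, license_name):
    """Return the text of a README skeleton with one stub per section."""
    lines = [
        title,
        "=" * len(title),
        "",
        "Authors: " + ", ".join(authors),
        "License: " + license_name,
        "",
    ]
    for section in README_SECTIONS:
        lines += [section, "-" * len(section), "TODO: describe.", ""]
    return "\n".join(lines)


# Hypothetical example values.
text = render_readme("Example dataset (hypothetical)", ["Jane Doe"], "CC BY 4.0")
print(text)
```

Save the output as a README file next to your data and replace each TODO before publishing.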

What about code?

Publishing code with Zenodo works the same way, and if you use GitHub it is even easier: follow this easy GitHub documentation. Publishing your code on Zenodo makes it citable and in line with funders’ guidelines, since GitHub, GitLab, and similar platforms are not considered proper publishing platforms.

Why publish my dataset on the Zenodo EPFL community?

By associating your work with the Zenodo EPFL community you will gain the benefits of increased visibility and publishing curation. Once submitted, the RDM Library Team will provide proper and rapid curation, help you meet some simple criteria of good publication practices, and make the most out of your work via added services.

What are the added services to publishing on the Zenodo EPFL community?

In addition to the curation, accepted datasets will be conveniently listed on Infoscience, for added visibility and easy reporting; moreover, if the dataset is related to a publication and its license allows it, it will be archived long-term with ACOUA, EPFL’s long-term preservation service.

What is the benefit of having my dataset listed on Infoscience?

Listing your dataset on Infoscience increases its visibility and makes it easier for others to find and cite your work. It also helps in reporting and ensures your dataset is preserved long-term if it is related to a publication.

Can I update my dataset after publishing on Zenodo?

Yes. Modifying individual files is possible, but only for a limited time after publication; usually, a new version can be created with the modified dataset. Each new version receives its own DOI, and all versions of a dataset are issued a global DOI that points to the latest version.

Can I manage the accessibility of my dataset on Zenodo?

You can edit the accessibility of a dataset by simply switching it from “Restricted” to “Public”. We encourage you to do the opposite only when mandatory or strictly necessary: restricting the access to a previously open dataset hinders its compliance with major research funders and EPFL guidelines.

How do I grant access to specific users?

If you want to have your work in “Restricted” access, but allow potentially interested users to ask you for permission, use the related option on the Sharing tab.

Any alternatives to Zenodo for my data or code publication?

You can use other platforms such as data journals, institutional repositories, or other data repositories. The RDM Library Team created a comparative table of data dissemination platforms, based on EPFL researchers’ DMPs and survey results. Contact [email protected] if you want guidance in deciding on the best data publication strategy for you.

What about journals for publishing data or code?

You can find journals for data and code that are open access with no article processing charge (APC), known as Diamond Open Access, or that charge an APC, known as Gold Open Access. Unlike data repositories, journals require an accompanying text, which is usually peer-reviewed along with the dataset. Each of these solutions has specific submission guidelines and can provide different visibility for your work. Contact [email protected] or use this Data Policy Finder to learn more about individual data policies.

What is a Data Management Plan (DMP)?

A DMP is a document that outlines how data and code will be handled during and after a research project, e.g. collection, documentation, storage, software, sharing, publication, and archiving. SNSF, Horizon Europe, and EPFL (for internal projects), to name just a few, require a DMP to obtain a grant and to ensure alignment with security, ethics, Open Science, and other guidelines. See more here about Data planning and guidelines.

What are FAIR principles?

FAIR principles ensure that a dataset is Findable, Accessible, Interoperable, and Reusable. Beyond guiding the publication of data and code as academic outputs, these principles also shape their planning and implementation in research workflows, enhancing quality and reproducibility even before publication.

How do DMPs and FAIR Principles relate?

It is usually required that the management of data and code be planned according to the FAIR data principles. As the SNSF also emphasizes, DMPs are not just administrative tasks but crucial tools that enhance research transparency, reproducibility, and impact by ensuring the data quality and reach of research outcomes, and the FAIR principles are at their core.

Where can I find guidance on writing a DMP?

The EPFL RDM Library Team provides DMP templates (e.g. EPFL DMP template, SNSF DMP template) and guidelines for creating DMPs that meet funder requirements. As most submissions by the EPFL research community concern the SNSF, the team also created a step-by-step guide to writing your SNSF DMP.

What if I need specific guidance on my DMP?

The EPFL RDM Library Team is available to provide personalized support and feedback on DMP drafts, with reviews returned within a maximum of two working days. For any question on DMPs, or to request a review, contact [email protected].

Can I modify my DMP after the project has started?

Yes, DMPs are considered living documents and should be updated regularly to reflect changes in the project. This ensures that data management practices remain relevant and effective throughout the research lifecycle. The ERC also treats the DMP as a project deliverable and publishes its latest version when the project ends.

How does the DMP change if my research involves personal or sensitive data?

While the core elements of a DMP are similar, specific requirements vary depending on the funder, the discipline, the countries involved in data collection and processing, and the type of data involved. All major DMP templates, however, include a section on the treatment of personal or sensitive data.

What are planning considerations in case of personal or sensitive data?

For projects involving sensitive data, you need to plan storage solutions in more detail, specify responsibilities more precisely, and think ahead about anonymization methods and risk minimization. Knowing who will have access to which data, and on which storage, is crucial during planning.

In the DMP, carefully describe the extent of collaborations, and be aware that if you gather only data that has already been anonymized, your project is not considered to involve personal or sensitive data.

What do I need for an ethics review and who can I contact?

The EPFL DMP template is required by HREC for ethics reviews: it contains a specialized annex for data protection, which helps both to plan your data management and to inform the review process. For specific questions about personal or sensitive data, you may contact the Research Office team at [email protected].

How can I make a DMP actionable for my project?

Being a living document, the DMP is most useful when it is updated throughout the project. It lets new members get up to speed much faster, and it helps you find information easily after the project ends.

  • Use itemized lists to keep the text short, and refer to them throughout the DMP
  • Don’t focus only on raw data; also list processed data and code
  • Avoid redundant file formats, preferring open, non-proprietary ones
  • Name all tools, software, and platforms
  • List all storage solutions, local and online, for sharing, processing, archiving data, etc.
  • Insert as many useful links as possible, so that your DMP becomes a dashboard of sorts
  • Keep the DMP shared with all project team members, ask for suggestions
  • Name responsible people, clarify their roles, and make them read the DMP!
  • Describe a naming convention for files and folders
  • Specify your dissemination strategy: which data repository and which licenses
  • Tell what to write into README files or other documentation for your dataset
  • Be transparent (not verbose) about what you cannot yet decide
  • Gradually update the DMP from a plan to a description of your data workflow
  • Publish or archive your DMP alongside your dataset
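
One item in the list above, the naming convention for files and folders, becomes easier to follow when it can be checked automatically. The sketch below assumes a purely hypothetical convention of the form YYYYMMDD_project_description_vNN.ext; the pattern, the helper name `check_names`, and the example file names are illustrations, not an EPFL recommendation.

```python
import re

# Hypothetical convention: YYYYMMDD_project_description_vNN.ext
# (lowercase letters, digits, and hyphens only; two-digit version).
NAME_PATTERN = re.compile(
    r"^(?P<date>\d{8})_(?P<project>[a-z0-9]+)_(?P<description>[a-z0-9-]+)"
    r"_v(?P<version>\d{2})\.(?P<ext>[a-z0-9]+)$"
)


def check_names(filenames):
    """Return the file names that do not follow the convention."""
    return [name for name in filenames if NAME_PATTERN.match(name) is None]


# Hypothetical example file names.
names = [
    "20240315_projx_raw-spectra_v01.csv",
    "final data (new).xlsx",
    "20240316_projx_calibration_v02.json",
]
print(check_names(names))
```

Whatever convention you choose, writing it down once as a pattern in the DMP gives every team member the same unambiguous reference.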

What are simple steps to prepare my research data for long-term preservation?

Make sure to select the data and code you want to preserve, clean them if necessary, document them (e.g. with a README file, the DMP, protocols, etc.), and select an archiving system that allows retrieval in the long term.

What about file formats?

If you used open, non-proprietary file formats, future reuse is simpler: although it is better to use them from the beginning, you can also convert your proprietary files into open ones before archiving both.
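Before archiving, it can help to list which files are candidates for conversion. The sketch below is only an illustration: the mapping from file extensions to open alternatives is an assumption chosen for the example, the helper name `suggest_conversions` is hypothetical, and the right target format depends on your discipline and on what information must be preserved.

```python
from pathlib import PurePath

# Illustrative mapping only (an assumption, not an EPFL list).
OPEN_ALTERNATIVES = {
    ".xls": ".csv",
    ".doc": ".odt",
    ".sav": ".csv",
    ".psd": ".tiff",
}


def suggest_conversions(filenames):
    """Map each file with a listed extension to a suggested open format."""
    suggestions = {}
    for name in filenames:
        suffix = PurePath(name).suffix.lower()
        if suffix in OPEN_ALTERNATIVES:
            suggestions[name] = OPEN_ALTERNATIVES[suffix]
    return suggestions


# Hypothetical example file names.
found = suggest_conversions(["results.XLS", "notes.txt", "survey.sav"])
print(found)
```

The script only reports candidates; the conversion itself is best done with the software that produced the files, keeping the original next to the converted copy.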

What archiving system should I use?

EPFL offers the ACOUA (Academic Output Archive) service for long-term preservation of research data: it is suitable for data that needs to be securely archived, but not necessarily openly shared.

On ACOUA, archived data are preserved and managed, but not publicly visible or searchable. Access is possible only by authorized request, typically by the original research team or EPFL services responsible for verification or compliance.

You can find other archiving systems in this comparative table by filtering type by “Archive”.

When is archiving more appropriate than publishing?

By publishing data or code, you give other researchers access to your work, intended as an academic output resulting from your research. Archiving is not the same as publishing: you can also archive negative or incomplete results, or other interesting work even if it is not publishable right away. In principle, everything that is useful for potential auditing, for future projects, or as a complement to published results needs archiving.

Why should I archive my data with ACOUA?

Archiving your research data (including code) ensures it remains accessible and usable to you, or your collaborators, beyond the lifespan of individual projects or storage solutions. It helps meet funders’ requirements for data retention, plus EPFL rules (LEX 3.3.2, Compliance Guide, FAQ End of Thesis). Keep in mind that EPFL NAS, faculty servers, or EPFL webpages are not fit solutions for long-term preservation.

What services are offered with ACOUA?

  • Archive the entire dataset underlying your publication (incl. unpublished results)
  • Archive datasets of a finished research project
  • Archive datasets of a collaborator leaving EPFL
  • Archive datasets of a laboratory or research group before its closure
  • Free your faculty server quota for large datasets that need preservation
  • Preserve reference raw data useful during your present or future research
  • Get expert support for data curation prior to archiving
  • Get periodic data integrity audits in the years to come
  • Get reference on Infoscience (metadata only)

Does ACOUA assign a persistent identifier (such as a DOI)?

A persistent identifier is assigned, which makes your dataset citable (though not directly downloadable) via Infoscience, where you can also choose the retention period and access rights.

How much does ACOUA cost?

ACOUA is free and allows the secure archiving of datasets of up to 10 TB each.

How does ACOUA handle data security?

ACOUA ensures data security through periodic integrity audits, secure access controls, and compliance with EPFL’s data retention policies. This guarantees that your data remains safe and intact over time. Nonetheless, for legal reasons, personal or sensitive data are not admitted unless they are already encrypted and their archiving is authorized by the HREC. Anonymized data are not considered personal or sensitive data.
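
To give an idea of what an integrity check involves in principle, the sketch below records a SHA-256 checksum for each file in a folder and later reports the files whose content has changed. It is a generic illustration that you could run on your own copy of the data; it does not describe how ACOUA performs its audits, and the helper names and the example file are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file (path relative to the folder) to its checksum."""
    folder = Path(folder)
    return {
        path.relative_to(folder).as_posix(): sha256_of(path)
        for path in sorted(folder.rglob("*"))
        if path.is_file()
    }


def changed_files(folder, manifest):
    """List files from the manifest that are now missing or different."""
    current = build_manifest(folder)
    return sorted(name for name in manifest if current.get(name) != manifest[name])


# Demonstration on a temporary folder with one hypothetical file.
with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "measurements.csv"
    data_file.write_text("sample,value\na,1\n", encoding="utf-8")
    manifest = build_manifest(tmp)
    unchanged = changed_files(tmp, manifest)
    data_file.write_text("sample,value\na,2\n", encoding="utf-8")
    modified = changed_files(tmp, manifest)

print(unchanged, modified)
```

Storing such a manifest with your documentation lets you, or anyone receiving the data later, confirm that the files are still identical to what was deposited.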

How do I proceed to archive in ACOUA?

Contact [email protected] or [email protected] to start archiving. If you need to archive personal or sensitive data, also contact the EPFL Human Research Ethics Committee (HREC) at [email protected].

How can I choose the right tools for managing my research data workflows?

When selecting data management tools (software, storage solutions, online platforms), consider: improvements over your existing workflow; ease of use; support for exporting in standard formats; and legal compliance. The tools should enable efficient data processing, analysis, and sharing while ensuring data security and interoperability.

What tools are available for data management and processing?

You can find a list of useful RDM Tools on our dedicated webpage. Here are some of them:

  • Electronic Lab Notebooks: like EPFL ELN, SLIMS, or openBIS
  • Data processing environments: such as Jupyter Notebooks, e.g. with NOTO or RENKU at EPFL; or commercial solutions like MATLAB.
  • For code development: you can also opt for c4science or EPFL Gitlab.
  • For surveying studies: REDCap can be used to create online surveys while ensuring compliance with EPFL policies. Other surveying platforms are Findmind and SurveyHero, both based in Switzerland.
  • For numerical simulations: the LSMS laboratory offers Akantu as an open-source solution for finite element simulations; other options include LAMMPS or Quantum ESPRESSO.

What sharing tools and platforms?

For sharing data during the project, EPFL supports different infrastructures, including SWITCH Drive, Microsoft 365, GitLab EPFL, RCP NAS, and of course the EPFL Servers, which offer secure collaboration environments for different use cases.

What back-up tools?

For back-ups, EPFL provides institutional solutions such as Atempo Lina for automatic backups, while ACOUA is intended for archiving datasets for secure long-term preservation.

Where can I get advice for optimizing my data workflow?

You can also consult with the RDM Library Team for advice on suitable tools for your needs, and to help you streamline your process, to eliminate redundancy and shorten the path to data/code publication.

Electronic Lab Notebooks (ELNs), why and which ones?

ELNs are digital versions of traditional lab notebooks, designed to manage and document research data, protocols, and workflows. While serving a wide range of needs, ELNs are usually designed for domains such as chemistry, materials science, and life sciences. At EPFL, we suggest EPFL ELN, SLIMS, or openBIS, but you can find a wider choice in this Electronic Lab Notebook Comparison Matrix. Also check our Fast Guide #7.

How can ELNs benefit my research?

  • Help implement FAIR Principles by streamlining data organization
  • Give PIs or data managers better oversight of projects and better access management
  • Share documentation and data with collaborators in an organized way
  • Eliminate problems associated with bad handwriting or corrections
  • Search information and data, preventing their loss or the need to redo work
  • Improve reproducibility of experiments
  • Help with data re-usability
  • Simplify data publishing, with export of projects to be uploaded on a data repository.

What is licensing?

A license is an official document that grants others permission to use, modify, or distribute something for which you hold the copyright, such as code you have written or data you have collected.

Why is licensing important?

Without a license, the permitted use of your materials remains ambiguous. A license makes collaboration easier and clarifies how your work can be reused, shared, and published.

What are available licenses for data?

There are multiple licenses available for data, each specifying the conditions under which distribution, remixing, adaptation, and building upon the data are allowed. The most common are Creative Commons (CC) licenses, which are pre-formulated basic licenses for works protected by copyright. They can be used for texts, photographs, images, films, drawings, music, files for 3D printers, etc. Here is a list from most open to least open:

  • CC0 (Public Domain Dedication): No restrictions on use.
  • CC BY (Attribution): Credit must be given to the creator; allows commercial use.
  • CC BY-SA (Attribution-ShareAlike): Credit must be given to the creator; any changes must be shared under the same license; allows commercial use.
  • CC BY-NC (Attribution-NonCommercial): Credit must be given to the creator; only noncommercial uses are permitted.
  • CC BY-NC-SA (Attribution-NonCommercial-ShareAlike): Credit must be given to the creator; any changes must be shared under the same license; only noncommercial uses are permitted.
  • CC BY-ND (Attribution-NoDerivs): Credit must be given to the creator; no derivatives or adaptations are permitted; allows commercial use.
  • CC BY-NC-ND (Attribution-NonCommercial-NoDerivs): Credit must be given to the creator; no derivatives or adaptations are permitted; only noncommercial uses are permitted.
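
The conditions in the list above combine in a regular way, which the following sketch makes explicit. It is a simplified illustration of how the license names are composed from the conditions, not the official Creative Commons chooser and not a substitute for advice on your specific case; the function name `choose_cc_license` and its parameters are hypothetical.

```python
def choose_cc_license(attribution=True, allow_commercial=True,
                      allow_derivatives=True, share_alike=False):
    """Return the CC license name matching the chosen conditions."""
    if not attribution:
        # CC0 waives all conditions, so it cannot carry any restriction.
        if share_alike or not (allow_commercial and allow_derivatives):
            raise ValueError("CC0 cannot be combined with restrictions")
        return "CC0"
    if share_alike and not allow_derivatives:
        raise ValueError("ShareAlike requires derivatives to be allowed")
    parts = ["BY"]
    if not allow_commercial:
        parts.append("NC")
    if not allow_derivatives:
        parts.append("ND")
    elif share_alike:
        parts.append("SA")
    return "CC " + "-".join(parts)


default_choice = choose_cc_license()
restricted_choice = choose_cc_license(allow_commercial=False, allow_derivatives=False)
print(default_choice, restricted_choice)
```

For example, requiring credit, excluding commercial use, and requiring adaptations to be shared alike yields CC BY-NC-SA.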

What are available licenses for code?

There are various licenses available for code, each with specific terms regarding use, modification, distribution, and warranty. Here are some common licenses:

  • MIT: Permissive license allowing use, distribution, and modification of the code, provided the copyright and license notice is kept. No warranty provided.
  • Apache 2.0: Permissive license allowing use, modification, and distribution of the code, including an explicit grant of patent rights. No warranty provided.
  • BSD: Similar to the MIT License; it is permissive, allowing use, modification, and distribution of code.
  • GPL (General Public License): Copyleft license requiring derived works to be licensed under the same terms. Mixing with incompatible licenses is not allowed. Permits patent use and requires distributed derivatives to be open source.
  • LGPL (Lesser General Public License): More permissive version of the GPL for software libraries. Allows linking with non-(L)GPL software, ensuring modifications to LGPL parts remain open source.
  • AGPL (Affero General Public License): Strong copyleft license like GPL, with an additional requirement to provide source code to network users. Permits patent use and requires network-delivered derivatives to be open source.

How do I know which license to use?

If you don’t know which license to use, there are many online tools that can help you. Creative Commons developed a tool to help you choose a license for your data. For code, you can refer to this open source tool.

What if I am still not sure which license to use?

If you are still unsure which license to use, please feel free to contact the Technology Transfer Office (TTO).

How do I apply a license to my work?

To apply a license to your work, you should include a clear statement of the chosen license in your documentation or within the work itself:

  • For code, you can add a LICENSE file in your repository with the full text of the license.
  • For data or other content, include a statement (e.g.: “This work is licensed under the license name”) along with a link to the full license text. If you are using Creative Commons licenses, you can also use their provided badges and legal code.
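
As an illustration of the second point, the sketch below builds such a statement for a Creative Commons license. It assumes version 4.0 of the licenses (1.0 for CC0), which the text above does not specify; the function name `license_statement` and the example title are hypothetical.

```python
# URLs of the Creative Commons 4.0 licenses and of the CC0 1.0 dedication.
# Using version 4.0 is an assumption; adapt it to the version you choose.
CC_URLS = {
    "CC0": "https://creativecommons.org/publicdomain/zero/1.0/",
    "CC BY": "https://creativecommons.org/licenses/by/4.0/",
    "CC BY-SA": "https://creativecommons.org/licenses/by-sa/4.0/",
    "CC BY-NC": "https://creativecommons.org/licenses/by-nc/4.0/",
    "CC BY-NC-SA": "https://creativecommons.org/licenses/by-nc-sa/4.0/",
    "CC BY-ND": "https://creativecommons.org/licenses/by-nd/4.0/",
    "CC BY-NC-ND": "https://creativecommons.org/licenses/by-nc-nd/4.0/",
}


def license_statement(work_title, license_name):
    """Return a one-line license statement with a link to the full text."""
    url = CC_URLS[license_name]
    version = "1.0" if license_name == "CC0" else "4.0"
    return (f'"{work_title}" is made available under '
            f"{license_name} {version}. Full text: {url}")


# Hypothetical example title.
statement = license_statement("Example dataset", "CC BY")
print(statement)
```

The resulting line can be pasted into the README file and into the description field of the repository record.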

Can I change the license of my work after it has been published?

Yes, you can change the license of your work after it has been published; however, the change only applies to future distributions of your work. Anyone who has obtained the work under the previous license can still use it under those terms. To change the license, update the licensing information in your documentation and inform users of the change.

Can I license only parts of my code or data under different licenses?

Yes, you can license different parts of your code or data under different licenses. This practice allows incorporation of third-party components with different licensing terms. Clearly specify the applicable license for each part in your documentation.

Is licensing harmonized internationally?

The protection of data by law is not harmonized internationally but varies depending on the specific country. Licenses do not all have the same international recognition.

Contact

[email protected]


+41 21 693 21 56

