Scholarly Publishing and Open Access blog

The latest news and answers to your questions about scholarly publishing and open access.


Why should I preserve my data?

Published by Alison Moore

Making a long-term plan for research data

This blog post was written by Kelsey Poloney, Data Management and Systems Archivist.

What counts as research data?

Research data is any material that you collect, observe, or create in the course of research and which supports your findings. Research data is more than just numeric data in spreadsheets; it can also include photographs, film, interview transcripts, and more. Essentially, it is any material that you would need to validate your findings to someone questioning them, or that someone else would need in order to replicate your findings.

Why preserve research data?

As a researcher, you have likely invested a significant amount of time and effort into your research project. That time and effort make your data valuable, and you don’t want to see it lost or forgotten once your research is complete. Other factors, such as grant requirements and professional standards of reusability and access, may also be motivations to preserve data.

Data preservation is the key to ensuring that your work remains accessible to you and your professional community. You or other researchers may want to reuse your research data for future studies, and planning for preservation means that the work you put into it won’t need to be redone.

How do I preserve my data?

Data preservation encompasses more than simply keeping data saved on Dropbox or your personal computer. Ideally, you should consider the preservation of your research data at every step of the process, and plan for the long-term storage and access of all of your related records. Creating a data management plan can help you to organize and identify your data’s preservation needs at the start of your research.

Try to use open source, platform-independent file formats when possible to avoid obsolescence. For example, use comma-separated values (.csv) rather than Excel spreadsheets (.xlsx) for tabular data, and use rich text format (.rtf) or plain text (.txt) for textual data. Keep your documentation (such as definitions of variables and abbreviations, code or software used, and access restrictions) and your backups up to date throughout the project so you don’t forget what data you have. Conduct regular audits to be sure that you haven’t lost any information along the way.
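One possible way to run the regular audits mentioned above is to record a checksum for every file in your project folder and compare against that record later. The Python sketch below is only an illustration, not an SFU-recommended tool: the function names (`sha256_of`, `build_manifest`, `audit`) and the choice of SHA-256 are assumptions made for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to data_dir) to its checksum."""
    return {
        path.relative_to(data_dir).as_posix(): sha256_of(path)
        for path in sorted(data_dir.rglob("*"))
        if path.is_file()
    }


def audit(data_dir: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare the folder's current contents against an earlier manifest."""
    current = build_manifest(data_dir)
    return {
        # Files listed in the old manifest that no longer exist.
        "missing": sorted(set(manifest) - set(current)),
        # Files present in both whose contents differ.
        "changed": sorted(
            name
            for name in manifest.keys() & current.keys()
            if manifest[name] != current[name]
        ),
        # Files added since the manifest was built.
        "new": sorted(set(current) - set(manifest)),
    }
```

In practice, you might call `build_manifest` on your project folder after each data collection session, save the result alongside your documentation, and run `audit` against it at your next check. Anything reported as missing or changed is worth comparing against your backups.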

At the conclusion of your project, decide what information will be preserved long-term and where you will store it. Consider the following questions when selecting what you will preserve:

  • Is the research data directly connected to published research?
  • Will my data be useful to other researchers and future projects?
  • Does my funder require data sharing?
  • How sensitive is my research data? Are there any privacy concerns? 
  • Do I have consent from research participants to share their data?

Data repositories are often the best choice for depositing your data for preservation and access. A trusted digital repository will manage the specifics of digital preservation for you. Most digital repositories have certain requirements for the amount and quality of your documentation, data cleanliness, and file formats. Choosing a data repository for long-term storage can also make it easier for others to find and access your work. At SFU, we recommend depositing research data in either Summit or FRDR, both of which provide managed digital preservation and are searchable platforms that make your data easily discoverable.

Additional information

If you need guidance on creating a data management plan or choosing a data repository, or have any other questions about research data management, contact the Data Services team at data-services@sfu.ca. More information, including workshop schedules and recorded presentations, can be found at https://www.lib.sfu.ca/data