Storage for your research data

Determining what facilities and equipment will be needed for data storage is an important component of planning for data management. It is useful to think about file storage solutions through two distinct stages of your research project: first during the active stage, and then for the long-term storage of the final outputs.

Active data storage

During the active stage of your research project, the data collected will likely be regularly modified and possibly accessed by more than one research team member. At this stage of a project, data storage and sharing between team members should allow for several points of collection, and for work on cleaning, transformation, and analysis of files. In some cases, files may need to be transported to and from cloud-based research analysis software. This ‘active data storage’ is typically necessary as a short-term solution, only used during the collection and analysis phases of your project. 

Some active data storage options affiliated with SFU include:

  • Microsoft OneDrive at SFU
    • for individuals at SFU
    • currently limited to 1TB
    • data transfer via browser or desktop integration
  • SharePoint storage associated with Microsoft Teams at SFU 
    • only staff and faculty can request a Team to be set up
    • storage currently limited to 25TB in total
  • SFU Cloud
    • hosted in the SFU Data Centre
    • currently available at cost and in blocks of 1TB
    • data transfer via Globus
  • Digital Research Alliance of Canada Nextcloud storage
    • free for all Canadian researchers
    • limited to 100GB
    • data transfer via browser or desktop integration
  • Digital Research Alliance of Canada Arbutus cloud
    • bare metal servers useful for data intensive projects
    • free for Canadian researchers
    • data transfer via Globus
  • Your department or lab might also provide storage servers (e.g., managed by departmental IT)
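
As a rough aid for comparing the options above against the expected size of a project's data, the short Python sketch below filters them by capacity. It is only an illustration: the dictionary, the shortened option names, and the function name are hypothetical; the limits are the ones listed above and may change; 1TB is assumed to equal 1000GB; and options with no limit stated in this guide are treated as always fitting.

```python
from typing import Optional

# Capacity limits in GB, taken from the list above.
# None means no overall limit is stated in this guide.
# Assumption: 1TB is treated as 1000GB.
STORAGE_LIMITS_GB: dict[str, Optional[int]] = {
    "Microsoft OneDrive at SFU": 1_000,
    "SharePoint (Microsoft Teams at SFU)": 25_000,
    "SFU Cloud": None,  # available at cost, in blocks of 1TB
    "Alliance Nextcloud": 100,
    "Alliance Arbutus cloud": None,
}


def options_that_fit(size_gb: float) -> list[str]:
    """Return the listed options whose stated limit can hold size_gb."""
    return [
        name
        for name, limit in STORAGE_LIMITS_GB.items()
        if limit is None or size_gb <= limit
    ]


# Example: a project expecting roughly 500GB of active data.
candidates = options_that_fit(500)
```

Capacity is only one factor; access for team members, cost, and the sensitivity of the data also affect which option is appropriate.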

External commercial cloud storage and other storage options (such as Dropbox, Google Drive, and AWS) also exist; however, care must be taken when considering their use, and there are guidelines for what types of data can be stored on such services. For more information on ethics considerations, contact Research Ethics; for legal considerations, contact Research Services.

Long-term data storage

Researchers invest a great deal of time, effort, and resources in collecting data to support their projects. This investment means research data has significant value in addition to its intrinsic value as a record of human knowledge. You or one of your collaborators may want to use data again in the future, and funding agencies may have requirements for whether data is to be saved at the end of a project. Before you begin data collection, it is important to determine what files you will keep, where, and for how long. In some cases the research ethics process stipulates that certain data needs to be destroyed within a given time frame. Here are some considerations on what to store for the long term:

Once you have selected the materials to be saved, try to use stable and accessible file formats for long-term storage.

Research projects sometimes use active data storage infrastructure for long-term storage as well. Such storage, whether locally supported (e.g., SFU OneDrive) or commercial (e.g., Dropbox or Google Drive), might be adequate for personal use but is typically not a dependable solution for long-term data discovery or stable access. Instead, data can be published online either in institutional repositories or in trusted external data archives and subject-specific repositories. A stable and established data repository will manage all aspects of long-term data storage for you.

If you are a student at SFU and your research data files support your thesis, consider depositing your data in Summit, SFU's institutional research repository. In other cases, researchers can publish their data in the SFU Research Data Collection in the Canadian Federated Research Data Repository (FRDR). 

If publishing your data in a repository is not an option, consider using nearline storage for inactive data available from the Digital Research Alliance of Canada.