Versioning your research data

 

March 20, 2020 update: While the physical SFU Libraries are temporarily closed due to COVID-19 measures, we are happy to continue to support you with any research data management questions remotely. Please feel free to contact us by email at data-services@sfu.ca during this time, and we can set up a meeting via video conferencing or telephone.

 

Research projects commonly accumulate many copies and versions of data files. A common way to tell versions apart is to name them with a consistent method. Dedicated file versioning software can also manage this automatically. The primary goal of versioning is to keep raw data organized during the collection phase, and distinct from cleaned and transformed data.

In addition to keeping track of file versions, you should also structure your research data files by organizing and naming them in a consistent way.

Best practices for file version management

These are general best practices; you and your research team should decide which methods work in your research context.

Avoid descriptive version labels

Avoid using descriptive labels for versions (final, draft, revision, etc.), which can make it difficult to interpret file version chronology.

Use ordinal numbers

Consider using ordinal numbers (1, 2, 3, and so on) to identify significant version changes (in practice, this might look like datafile1, datafile2).
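One practical detail with numbered names is that plain alphabetical sorting stops matching version order once you reach version 10. The following is a minimal Python sketch of sorting by the number itself. The helper name `version_number` and the sample file names are hypothetical, and the sketch assumes that every name ends in a whole-number version.

```python
import re


def version_number(name: str) -> int:
    """Extract the trailing whole-number version from a name like 'datafile12'."""
    match = re.search(r"(\d+)$", name)
    if match is None:
        raise ValueError(f"no trailing version number in {name!r}")
    return int(match.group(1))


# Hypothetical file names, listed out of order.
names = ["datafile10", "datafile2", "datafile1"]

# Alphabetical order puts version 10 before version 2.
print(sorted(names))                      # ['datafile1', 'datafile10', 'datafile2']

# Sorting by the extracted number gives the chronological order.
print(sorted(names, key=version_number))  # ['datafile1', 'datafile2', 'datafile10']
```

Another option, if your team prefers it, is to pad the numbers with leading zeros (datafile01, datafile02) so that alphabetical order and version order agree.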

Apply versioning methodology

Apply your versioning method consistently, even for less significant changes.

Use decimal points

Decimal points may be used to denote less significant changes (e.g., marking smaller changes by naming successive versions as datafile1.1, datafile1.2).
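The numbering scheme described in this section can be sketched as a small Python helper that proposes the next file name. This is only an illustration: the function name `next_version_name`, the `significant` flag, and the pattern are assumptions, and the sketch expects a name without a file extension, as in the examples above.

```python
import re

# Hypothetical pattern: a base name, a whole-number version, and an optional
# decimal part, e.g. "datafile1" or "datafile1.2" (file extension excluded).
VERSION_PATTERN = re.compile(r"^(?P<base>.*?)(?P<major>\d+)(?:\.(?P<minor>\d+))?$")


def next_version_name(name: str, significant: bool) -> str:
    """Return the name to use for the next version of a versioned file."""
    match = VERSION_PATTERN.match(name)
    if match is None:
        raise ValueError(f"no version number found in {name!r}")
    base = match.group("base")
    major = int(match.group("major"))
    minor = int(match.group("minor") or 0)
    if significant:
        # Significant change: bump the whole number and drop the decimal part.
        return f"{base}{major + 1}"
    # Less significant change: keep the whole number and bump the decimal part.
    return f"{base}{major}.{minor + 1}"


print(next_version_name("datafile1", significant=False))    # datafile1.1
print(next_version_name("datafile1.1", significant=False))  # datafile1.2
print(next_version_name("datafile1.2", significant=True))   # datafile2
```

Note that this sketch counts the part after the decimal point as a whole number, so datafile1.9 is followed by datafile1.10 rather than datafile2. Whether that suits your project is a decision for your team.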

Use software that automates versioning

Where available, use software or services that automate versioning, so that you don't have to maintain the file naming conventions described above by hand:

Wikis have a page history feature and Google Docs has a version history feature, allowing users to easily restore previous versions of their documents.

The Open Science Framework (OSF) online research project management system has file versioning functionality tailored for research data use.

The free and open-source Git software system can be installed on your local computer and used without network access, and provides complete version tracking for files.
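As a rough illustration of what automated versioning with Git can look like, the Python sketch below records the current state of a project folder as a new version. It assumes Git is already installed and available on your PATH. The function names `run_git` and `snapshot` and the author details are placeholders invented for this example, while the Git commands themselves (`init`, `add`, `status`, `commit`) are standard.

```python
import subprocess
from pathlib import Path

# Placeholder identity used only for this sketch; replace with your own.
AUTHOR_NAME = "Research Team"
AUTHOR_EMAIL = "team@example.org"


def run_git(folder: Path, *args: str) -> str:
    """Run a git command inside `folder` and return its standard output."""
    result = subprocess.run(
        ["git", "-c", f"user.name={AUTHOR_NAME}",
         "-c", f"user.email={AUTHOR_EMAIL}", *args],
        cwd=folder, check=True, capture_output=True, text=True,
    )
    return result.stdout


def snapshot(folder: Path, message: str) -> bool:
    """Record the current state of every file in `folder` as a new Git version.

    Returns False when nothing has changed since the last snapshot.
    """
    if not (folder / ".git").exists():
        run_git(folder, "init")  # start tracking this folder
    run_git(folder, "add", "--all")  # stage new, changed, and deleted files
    if not run_git(folder, "status", "--porcelain").strip():
        return False  # nothing new to record
    run_git(folder, "commit", "-m", message)
    return True
```

A call such as `snapshot(Path("my_project"), "Add cleaned survey data")` (with a hypothetical folder name) would store a new version with that description, and `git log` run inside the folder then lists the recorded versions in order. File names stay the same from one version to the next, because Git keeps the history itself.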

Additional resources

Version control and authenticity - UK Data Service
Information about version control and data file authenticity best practices.