Data with principles: How to be FAIR, and why you should CARE
This blog post was written by Joe Wright, SFU Library co-op student
The Open Access movement is well under way, making research more accessible by removing barriers and increasing shareability. As research publications become more accessible, a question arises: where does this leave research data?
Why should research data be made accessible, and what does this mean for private or protected data? How can you make your data accessible properly and ethically? Thankfully, two sets of guiding principles for responsible and inclusive data management address these questions: the FAIR and CARE Data Principles.
Figure 1
Depiction of the FAIR and CARE principles
Note. From CARE principles for Indigenous data governance, by Research Data Alliance International Indigenous Data Sovereignty Interest Group, 2019, The Global Indigenous Data Alliance, (https://www.gida-global.org/).
What Does FAIR Data Mean?
The FAIR Guiding Principles for scientific data management and stewardship were published in 2016 as a collaboration between academia, industry, funding agencies, and scholarly publishers. They set a standard for the accessibility and reusability of data, from both a human and a machine perspective. The principles apply not only to the data itself but also to its metadata: the information that describes the data and gives it context, helping readers understand and organize it. Metadata includes things like the author, date created, and file size, and potentially much more.
The principles are represented by the acronym FAIR: Findable, Accessible, Interoperable, and Reusable. But what do those terms actually mean?
To be Findable (by humans and computers), (meta)data should be assigned globally unique and persistent identifiers (such as DOIs) and be richly described by metadata that clearly includes the identifier of the data it describes. Importantly, (meta)data should be registered or indexed in a searchable resource.
To be Accessible, (meta)data should be retrievable by this identifier, using standardized communication protocols (an example being HTTP). These protocols must be freely and openly implementable and should allow for authentication/authorization if needed. Metadata should always remain accessible even if the data itself is not.
To be Interoperable, (meta)data should use a common and accessible language for knowledge representation, use vocabularies that themselves follow FAIR principles, and include clear, meaningful references to other (meta)data.
To be Reusable, (meta)data should be thoroughly and accurately described with enough information to understand the data being accessed. (Meta)data should be released with clear and accessible usage licensing, origin/history, and attribution requirements. Finally, (meta)data must meet any domain-relevant community standards.
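To make the four principles a little more concrete, here is a minimal Python sketch of what a FAIR-minded metadata record might look like. The field names, the `DatasetRecord` class, the helper functions, and the example DOI are all assumptions made up for illustration; they are not part of the FAIR principles or of any particular repository's schema.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class DatasetRecord:
    """Illustrative metadata record; the field names are assumptions, not a standard."""
    identifier: str    # Findable: globally unique, persistent identifier such as a DOI
    title: str         # Findable: part of the rich description
    creators: List[str]
    description: str
    access_url: str    # Accessible: retrievable over a standard protocol such as HTTP(S)
    data_format: str   # Interoperable: a common, openly documented format
    license: str       # Reusable: clear usage licence
    provenance: str    # Reusable: origin and history of the data
    related_identifiers: List[str] = field(default_factory=list)  # optional links to other (meta)data


def missing_fields(record: DatasetRecord) -> List[str]:
    """Return the names of required fields that were left empty."""
    optional = {"related_identifiers"}
    return [f.name for f in fields(record)
            if f.name not in optional and not getattr(record, f.name)]


def doi_url(doi: str) -> str:
    """Build the resolver URL for a DOI (DOIs resolve through https://doi.org/)."""
    return f"https://doi.org/{doi}"


record = DatasetRecord(
    identifier="10.1234/example-dataset",  # made-up DOI, for illustration only
    title="Example survey responses",
    creators=["A. Researcher"],
    description="Anonymized responses to a hypothetical campus survey.",
    access_url=doi_url("10.1234/example-dataset"),
    data_format="text/csv",
    license="",  # left blank on purpose to show the check below
    provenance="Collected by a hypothetical research team in 2020.",
)

print(missing_fields(record))  # ['license']
```

A check like this only confirms that the fields are filled in. Whether the description is rich enough, or the licence appropriate, is still a judgment call for the researcher and their community.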
To learn more, see the GO FAIR initiative, which supports the implementation of the FAIR principles and breaks down each one in clear, detailed terms.
What about CARE Principles?
The CARE Principles for Indigenous Data Governance were developed by the Global Indigenous Data Alliance (GIDA) in 2019. These were created because the “current movement toward open data and open science does not fully engage with Indigenous Peoples’ rights and interests” (Research Data Alliance International Indigenous Data Sovereignty Interest Group, 2019, p. 1).
The FAIR Data Principles are part of that movement: they focus on the characteristics of data that enable and encourage sharing, but ignore the power differentials and historical contexts in which the data exist. This is especially important as Indigenous Peoples reclaim control over the application and use of Indigenous Data and Indigenous Knowledge for collective benefit.
When working with Indigenous Data, the acronym CARE represents the principles that should be followed. Below is a summary of the principles. The GIDA document cited above breaks each one down in more detail and is well worth reading to fully understand their context and motivation.
Collective Benefit - Data ecosystems shall be designed and function in ways that enable Indigenous Peoples to derive benefit and equitable outcomes from Indigenous Data, such as inclusive development, innovation, and citizen engagement.
Authority to Control - Indigenous Peoples’ rights, interests, and authority in Indigenous Data must be recognized and empowered, enabling Indigenous Peoples and governing bodies to determine how Indigenous Peoples are represented and identified within data, in accordance with cultural governance protocols. This also includes Indigenous lands, territories, resources, knowledges and geographical indicators.
Responsibility - Indigenous Data must be connected to relationships built on respect, reciprocity, trust, and mutual understanding, as defined by the Indigenous Peoples to whom those data relate. Information must be shared about how data are used to support Indigenous Peoples’ self-determination and collective benefit through openly available, meaningful evidence. This includes enhancing data literacy and supporting the development of an Indigenous digital infrastructure, able to generate data grounded in the languages, worldviews, and lived experiences of Indigenous Peoples.
Ethics - Indigenous Peoples’ rights and wellbeing should be the primary concern at all stages of the data life cycle and across the data ecosystem. Representation and justice, as well as consideration of potential future use (or harm) should be incorporated. This includes acknowledging the provenance and limitations or obligations for secondary use, especially in issues of consent.
Why are FAIR and CARE Data Principles Important?
Following practices like the FAIR and CARE Data Principles can increase the value and impact of your research, while contributing to sustainable and ethical relationships with research participants.
FAIR Data doesn’t necessarily mean ‘open’ or ‘free.’ The important aspect of ‘Accessible’ is that the conditions required in order to access the data are explained in a way that both a human and machine can understand. It’s perfectly reasonable to expect that users register an account to access a repository, for example. Both private and protected data can be FAIR!
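As a rough illustration of access conditions that both a person and a program can read, the Python sketch below stores the access level as a fixed term in the metadata and translates it into a plain-language statement. The three access levels, their wording, and the `describe_access` helper are hypothetical; real repositories define their own vocabularies.

```python
# Hypothetical controlled vocabulary: a machine reads the key,
# a person reads the explanation.
ACCESS_CONDITIONS = {
    "open": "freely downloadable without an account",
    "registered": "free to access after registering an account with the repository",
    "restricted": "available on request, subject to approval by the data steward",
}


def describe_access(metadata: dict) -> str:
    """Return a human-readable access statement for a metadata record."""
    level = metadata.get("access_level")
    if level not in ACCESS_CONDITIONS:
        raise ValueError(f"Unknown or missing access level: {level!r}")
    return f"{metadata['title']}: {ACCESS_CONDITIONS[level]}"


# The metadata stays public even though the data require an account.
survey = {"title": "Survey data", "access_level": "registered"}
print(describe_access(survey))
```

The point of the sketch is that a protected dataset can still be FAIR: the record itself remains findable and clearly states how access works.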
FAIR Data increases the useful life of information (decreasing costs associated with data redundancy), and accelerates and expands others’ research. The value of this effect is all the more clear during a pandemic. Accessible data sharing related to COVID-19 “made it possible to accurately diagnose infections early in the current emergency,” according to the World Health Organization (Moorthy et al., 2020). Increasing the accessibility of data also increases the transparency of research, enhancing public trust, a valuable prospect in increasing data literacy and reducing misinformation.
FAIR Principles concentrate on the data itself, while CARE Principles complement and build upon FAIR Data. Focusing solely on increasing the shareability of data creates tensions for Indigenous Peoples, who are asserting control over the application and use of Indigenous Data.
CARE Principles are “people and purpose-oriented, reflecting the crucial role of data in advancing Indigenous innovation and self-determination” (Research Data Alliance International Indigenous Data Sovereignty Interest Group, 2019, p. 1).
If you’re working with Indigenous Data, CARE Principles and FAIR Principles should be engaged simultaneously in order to keep in mind both the associated people and their worldviews. However, the CARE Principles approach could also be applied to many types of data that are not related to Indigenous Peoples, so that researchers do not ignore the contexts in which data exist and instead create opportunities for self-determination and self-governance within the knowledge economy.
If you’d like to learn more about issues related to responsible and inclusive data management, SFU Library’s Research Data Management (RDM) team provides consultations to all researchers, instructors, and graduate students on organizing, handling, and sharing data, as well as on ethical data considerations. RDM also has plenty of online resources on each of these topics, including a section on Indigenous Data Sovereignty. You can also learn more about respectful research from SFU’s own Indigenous Curriculum Resource Centre.
References:
Moorthy, V., Henao Restrepo, A. M., Preziosi, M.-P., & Swaminathan, S. (2020). Data sharing for novel coronavirus (COVID-19). Bulletin of the World Health Organization, 98(3), 150. https://doi.org/10.2471/BLT.20.251561
Research Data Alliance International Indigenous Data Sovereignty Interest Group. (2019, September). CARE principles for Indigenous data governance. The Global Indigenous Data Alliance. http://gida-global.org/