The R in FAIR

Findable

Accessible (where possible)

Interoperable

Reusable

I believe most of us are now familiar with this acronym: the FAIR principles, published in 2016. I have to admit that part of me really wants to create a song around these four words – but I’ll save you all from that scary venture. Seriously though, how many of us are aware of the FAIR principles? Better yet, how many of us are aware of the impact of the FAIR principles? Over my next blog posts we’ll take a look at each of the FAIR letters and I’ll pull them all together with the RDM posts – YES, there is a relationship!

So, YES, I’m working backwards by starting with the R, and there’s a reason for this. I really want to “sell” you on the idea of FAIR. Why do we consider this so important and a key to effective Research Data Management? Oh heck, it is also a MAJOR key to science today.

R is for Reusable

Reusable data – hang on – you want to REUSE my data? But I’m the only one who understands it! I’m not finished using it yet! This data was created to answer one research question; there’s no way it could be useful to anyone else! Do any of these statements sound familiar? Hmmm… I may have pointed some of these out in the RDM posts – but aside from that – truthfully, can you relate to any of these statements? No worries, I already know the answer and I’m not going to ask you to confess to having said, thought, or believed any of these. Ah, I think I just heard that community sigh of relief 🙂

So let’s look at what can happen when a researcher does not take care of their data or does not put measures into place to make their data FAIR – remember we’re concentrating on the R for reusability today.

Reproducibility Crisis?

Have you heard about the reproducibility crisis in our scientific world? That is, the inability to reproduce published studies. Imagine statements like this: “…in the field of cancer research, only about 20-25% of the published studies could be validated or reproduced…” (Miyakawa, 2020). How scary is that? Sometimes when we think about reproducibility and reuse of our data, the questions that come to mind – at least to my mind – are: why would someone want my data? It’s not that exciting, is it? But boy oh boy, when you step back and think about the bigger picture – holy cow!!! We are not just talking about data in our little neck of the woods – this challenge of making your research data available to others has a MUCH broader and larger impact! Only 20-25% of published studies!!! And that’s just in the cancer research field. If you start looking into this crisis you will see other numbers too!

So, really, what’s the problem here? Someone cannot reproduce a study – maybe it’s the age of the equipment, or my favourite – the statistical methodologies were not written in a way that would let the reader reproduce the results even IF they had access to the original data. There are many reasons why a study may not be reproducible – BUT – our focus is the DATA!

The study I referred to above also talks about some of the issues the author encountered in his capacity as a reviewer. The issue that I want to highlight here is the lack of access to the RAW data, or insufficient documentation about the data – aha!! That’s the link to RDM. Creating adequate documentation about your data will only help you and any future users of your data! Many studies cannot be reproduced because the raw data is NOT accessible and/or it is NOT documented!

Pitfalls of NO Reusable Data

There have been a few notable researchers who have lost their careers because of their data, or rather the lack thereof. One notable example is Brian Wansink, formerly of Cornell University. His research was ground-breaking at the time: studying eating habits and looking at how cafeterias could make food more appealing to children. It was truly great stuff! BUT….. when asked for the raw data….. that’s when everything fell apart. To learn more about this situation follow the link I provided above that will take you to a TIME article.

This is a worst-case scenario – I know – but maybe I am trying to scare you! Let’s start treating our data as a first-class citizen and not an artifact of our research projects. FAIR data is research data that should be Findable, Accessible (where possible), Interoperable, and REUSABLE! Start thinking beyond your study – one never knows when the data you collected during your MSc or PhD may be crucial to a study in the future. Let’s ensure it’s available and documented for the future – remember your Research Data Management best practices.

Michelle