RDM in the Real World

Surprise! Surprise! I’m switching gears for this blog post – off my historical data and data ownership pedestal for a bit 🙂

I want to talk about RDM – Research Data Management – today. For the past decade I’ve been working with colleagues offering workshops on this topic and working with the Research Data Lifecycle – yup, I’ve also talked about this in past posts.

 

Throughout these posts and the FAIR set of blog posts – I always think to myself – we are NOT teaching anyone anything new or really exciting. It is more about bringing these challenges to light and nudging everyone to think about RDM when they start their research projects. At the end of the workshops, I often have students thank me, comment on how funny my examples were, and leave. But once they get more involved with their projects – that’s when I get the “OH! I get it now!” and they start from scratch and re-organize their files and project folders.

As ADC matures, we are getting calls from projects to help with their RDM – specifically with the “management” aspect of their data.  Questions are usually to the effect of: we have terabytes of data – what do we do?  This is a basic yet VERY daunting question – let’s be honest!  So let’s work through this together and hopefully some of the tips we re-share here will help.

Organizing your Research Project Data

There are many ways to organize a project. Let’s use the bookshelf in my home office as an example. I can organize my books by author, or by topic, or by frequency of use, or by colour, or…. you get the idea! How I organize the books on my shelf is really a personal choice and based on how “I” use the books. Now, let’s turn to how to organize your project data. Chances are your team will have many different views and opinions on how to organize the data. The project team may consider organizing it by date received, or by the instrument used to collect the data, or by the individual collecting the data – the options are almost endless. In my opinion, there are a couple of ways to think about it: how the data was collected VS how the data will be used.

In extremely large projects where we have terabytes of data, you should start by asking yourself the very basic question: “How will we use this data?” or “How do we anticipate using this data?” Organizing the data by animal or plot does NOT make sense if you anticipate working with the data across many dates. So would organizing the data by dates be better? Let’s be honest – there is NO one way or right way for all! But, I HIGHLY recommend you and/or your team spend time working through the best organization for your data.

Let me show you WHY this will save you a LOT of time. Here is an image I took of my files from an old work laptop – eek! 15 years ago! There are a LOT of problems with how I organized my work laptop at that time. Now, if I need to find a presentation I did for a conference in 2009 – I will need to open ALL those PowerPoint presentations, review the content, rename each one, and place it in a more appropriate directory. A file named Friday_April_11 sitting in a large unorganized directory just doesn’t work! What if I need to find that historical polling data from the 1950s? I would need to open each Excel file and browse it to determine whether it is the correct one or not. Psst – it’s the one titled “husbands_fauults_maritalStatus” 😉

[Image: a list of files that are NOT organized]

In this directory there are only a few files – so it is manageable. BUT imagine this was a directory or folder on your computer with terabytes of your data! Names are all over the place: one instrument may provide filenames such as INSTR01.dat, one student may name their files MEdwards_202505.csv, and a third researcher may name their files PROJECT02_data.xlsx. Without any guidance, everyone places their data files wherever they think makes sense – think back to my bookshelf example – and you have one big mess, similar to my files back in 2009!
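
If your team does agree on a naming pattern, a few lines of Python can flag the files that do not follow it. This is only a rough sketch: the pattern used here (project, then an 8-digit date, then a short description) is a made-up convention for illustration, not an ADC recommendation, and the function names are hypothetical. Swap in whatever your team decides on.

```python
import re
from pathlib import Path

# ASSUMPTION: a hypothetical convention of <project>_<YYYYMMDD>_<description>.<ext>
# e.g. PROJECT02_20250514_soil-moisture.csv. Replace with your team's own rule.
NAME_PATTERN = re.compile(r"^[A-Za-z0-9]+_\d{8}_[A-Za-z0-9-]+\.[a-z0-9]+$")


def follows_convention(filename: str) -> bool:
    """Return True if the filename matches the agreed (hypothetical) pattern."""
    return NAME_PATTERN.match(filename) is not None


def find_nonconforming(folder) -> list[str]:
    """List, in sorted order, the names of files under `folder` that break the pattern."""
    return sorted(
        path.name
        for path in Path(folder).rglob("*")
        if path.is_file() and not follows_convention(path.name)
    )


if __name__ == "__main__":
    # The three filenames mentioned in the post, plus one that fits the pattern.
    for name in [
        "INSTR01.dat",
        "MEdwards_202505.csv",
        "PROJECT02_data.xlsx",
        "PROJECT02_20250514_soil-moisture.csv",
    ]:
        print(name, follows_convention(name))
```

Running find_nonconforming on a shared folder every so often gives you a short list of files to rename while the list is still short.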

Remember you have terabytes of data! The time it takes to open every file, review the contents, rename it, and move it to a new organizational structure is time saved IF you decide on an organizational structure at the start of your project! YES, as project management changes, there may also be a re-org of the data – but let’s come up with a structure, document it, and leave it for the whole team to use!
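
To make "come up with a structure, document it" a bit more concrete, here is a minimal Python sketch that creates a set of project folders and writes a README describing what goes where. The folder names and descriptions are assumptions picked purely for illustration; your team's structure should come out of the "how will we use this data?" conversation.

```python
from pathlib import Path

# ASSUMPTION: an illustrative layout only. Replace the folder names and
# descriptions with the structure your own team agrees on.
STRUCTURE = {
    "raw": "Data exactly as received from instruments or collaborators; never edited.",
    "processed": "Cleaned or derived data, organized the way the team analyses it.",
    "docs": "Protocols, data dictionaries, and notes on file naming.",
}


def create_project_structure(root) -> Path:
    """Create the folders under `root` and write a README.txt that documents them."""
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)

    lines = ["Folder structure for this project", ""]
    for name, purpose in STRUCTURE.items():
        (root / name).mkdir(exist_ok=True)  # safe to re-run on an existing project
        lines.append(f"{name}/ - {purpose}")

    readme = root / "README.txt"
    readme.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return readme


if __name__ == "__main__":
    # "my_project" is a placeholder folder name.
    print(create_project_structure("my_project"))
```

The README is the important part: the folders tell people where things are, and the README tells them why, which is what the next person joining the project needs.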

Sounds easy right??

If you and your team are starting a project and would like our help, please send us an email at adc@uoguelph.ca. We are currently working with a couple of larger projects and would love to help you out too!

Michelle