Data Ownership

I’ve been trying really hard to – as they say – “stay in my lane” for these data ownership conversations, sticking to the research data front, as that is my comfort zone and the world I live and play in. However, the topic of data ownership goes well beyond research data. Today, I want to take you on a little journey through historical data and that nasty topic of data ownership.

In one of my previous posts, “Data Ownership – another quandary to consider…”, I talked about finding a really cool source of information or data from the 1940s and 1950s – a fictional situation in that post. But I’ll be truthful here – it’s a situation based in reality. Here at the University of Guelph, our researchers have been conducting research for decades – heck, it’s 151 years for our Ontario Agricultural College researchers and 103 years for our Ontario Veterinary College researchers. That’s a LOT of very interesting research outputs and potentially a lot of fabulous data. But we know the reality when it comes to data as we see and use it today – check out “Historical Data – Where is it?”. However, we do have a TON, and I mean a TON, of research reports dating back to the 1950s – and they contain data! This is where the data archivist in me screams: Let it out!!!

So let’s talk about our options here. I’ll use a real example – the Ontario Forage Crops Committee (OFCC) supported and conducted forage crop trials from the early 1950s and published the results in an annual report. These reports have since been scanned and archived, and are available online at their new home under the Ontario Forage Council, since the OFCC was retired almost 10 years ago and no longer exists. Now, these reports contain wonderful data and could show you trends in yields, in the crops used/tested across the years, and in the changes in management. However… if you want to see these trends – well, 60 PDFs here I come! One of our stakeholders asked whether we could pull the data out of the PDFs and make it usable – oh, did I hear you say FAIR??? I jumped on that bandwagon without a second thought and put together a project to extract the data and create a data portal for it.

Can anyone see what I did NOT do?? I got excited, and I LOVE historical data – so yes, VERY excited! BUT… that little question popped up – who owns the data? Do I have the RIGHTS to pull this data out of one representation, create another, and make it available to the world??? Ugly, ugly questions! I mulled these over for quite some time and convinced myself that since the reports were out in the open, I could do whatever I wanted as long as I cited them – right? Hmm… I knew, even as I was convincing myself, that it wasn’t quite right! Why? Because I do NOT own that data or those reports!

Are we extracting the data and creating a portal? You’d better believe we are! How am I justifying this? I went to the holders or owners of these reports and asked for permission. Easy as that! Now, this is a great example of accessing historical information – but I’ll close this post with a more challenging one – one that I am itching to do, but that will need me to work through a few more things first, such as copyright.

Back to historical data sources. Has anyone heard of “The Monthly Bulletin of Agricultural Statistics”, published by the Dominion of Canada Department of Trade and Commerce, Census and Statistics Office? Oh my – now this is a true treasure trove of data – monthly data such as Area, Yield, Quality and Value of principal Field Crops in Canada, by province, going back to 1916!!!! This was brought to my attention by a former MSc student from our FARE department who is now doing a PhD in the US and was looking for this data to use in their study. Oh, I would LOVE to pull this data out and make it available to our community. BUT, and it is a big BUT – there is a lot to discuss here with regard to ownership and rights.

If you have thoughts and recommendations – or would like to help with this project – please reach out! I’ll chat about a few of the challenges I foresee next time…

Michelle

image created by AI

I left off my last blog post with a question – well, actually a few questions: WHO owns this data? The supervisor – who is the PI on the research project you’ve been hired onto? OR you, as the data collector and analyser? Hmmm… When you think about these questions, the next question becomes: WHO is responsible for the data and what happens to it?

As you already know, there really are NO clear answers to these questions. My recommendation is that the supervisor, PI, or lab manager sets out a Standard Operating Procedures (SOP) guide for data collection. Yes, I know this really does NOT address the data ownership question – but it does address my last question: WHO is responsible for the data and what happens to it? And let’s face it – isn’t that just another elephant in the room? Who is responsible for making the research data FAIR?

Oh my, have I just jumped into another rabbit hole?

We have been talking about FAIR data, building tools, and making them accessible to our research community and beyond – BUT are we missing the bigger vision here? I talk to researchers, and most agree that they want to make their data FAIR and share it beyond their lab – BUT… let’s be honest – that’s a lot of work! Who is going to do it? Here at ADC, our goal is to work with our research community to help them make agri-food data (and beyond) FAIR. We’ve been creating tools and training materials, and I thought we were on the precipice of changing the research data culture – but now I’m left wondering: who is RESPONSIBLE for setting out these procedures in a research project? WHO should be the TRUE force behind changing the data culture and encouraging FAIR research data?

Don’t worry – for anyone reading this – we are VERY determined to change the research data culture by continuing to make the transition to FAIR data easy and straightforward. It’s just an interesting question and one I would love for you all to consider – WHO is RESPONSIBLE for the data collected in a research project?

Till the next post – let’s consider Copyright and data – oh yes!  Let’s tackle that hurdle 🙂

Michelle

image created by AI

…and we’re back to the data ownership quandary…

Just when I think I may have heard all the different types of questions and situations that may arise in the context of data ownership – I hear a new one.  When I first heard the situation I’m going to share with you in a moment – I thought nah..  this must be a one-off.  But then I heard it again from a different individual and situation – so it MUST be a “thing”!  When I’m honest with myself, look back, and contemplate my own situations – I’m left wondering too!!!

So let’s work through a research situation.  You have been hired onto a project as a graduate student – working towards your MSc.  You’re SO excited and happy about this wonderful opportunity you have.  You work with your supervisor and lab group to create the most appropriate experimental design to answer your research question, and begin your data collection.   You heard about the Semantic Engine and created your data schema to match your data collection.  Two years down the road and you’re ready to move on – your thesis is complete and you’ve graduated.  What about your data?  What do you do with it?

The BIG question here – WHO owns this data? The supervisor – who is the PI on the research project you’ve been hired onto? OR you, as the data collector and analyser? Hmmm… When you think about these questions, the next question becomes: WHO is responsible for the data and what happens to it? I would love to hear what readers think about this. Email me at edwardsm@uoguelph.ca if you have an opinion.

OK, what are my thoughts? I’ll let you know in my next blog post 🙂

Michelle

image created by CoPilot

So we started the data ownership thread with The “Elephant in the Room” post.  We started with a fairly clean example but there was definitely room for interpretation – enough for an elephant some may say….  Let’s try another situation and see what everyone’s thoughts are.

Many of us rely on the wonderful internet to find data sources – let’s not discuss what’s happening south of our border – that’s another conversation for another day.   So, internet for data – BUT, what about the library?  What about those wonderful items we call books?  How about reports?  Historical reports?  Historical census?  There is a hidden treasure trove – yes my favourite saying these days – of data – if you take the time to browse the library for reports or other sources of data.

Let’s use an example. Hmm… you just discovered that there are published reports from the 1940s and 1950s on books read, and the associated ratings, by book club members from your town library. WOW!!! How cool!!! These reports show tables of book titles, a randomized ID for each book club member (no names or any personal identifiers), and a rating for each book read by each book club member. Wouldn’t it be cool to create a database with this information so people could search it and use it, maybe for research or an essay? You think this would be a great project, so you take it on and create a database with the data contained in these reports. Can you see where I’m going with this???

So… the library created the reports in the 1940s and 1950s – it stands to reason they “own” that representation of the data, right? BUT – you have now created a new representation of the data. It’s old stuff – who cares who owns it???? Nope! We care!!! So – who owns the data in this situation?? Like my earlier post – we can mull this over and over, talk it through, and come up with different answers almost every day!!!

This is why we need to discuss who has the rights to access the data and who can use it. I know everyone wants to talk about ownership – but let’s talk about ethical use of the information and create data use agreements, rather than spin on the “Who owns the data?” question.

Well…  I’ll stop here for today and come back to this in a couple of weeks.  For now – think about who “owns” the data or rather WHO can determine who can access this new representation of the historical book review data?

Thoughts or comments? Let me know by sending an email.

Michelle

image created by AI

Alrighty, let’s address that proverbial Elephant in the Room – WHO owns the data? This question goes round and round and round – but does it ever land comfortably for everyone? We have been discussing this topic for quite a while here at Agri-food Data Canada, and it keeps cropping up in all the projects we are developing and working with. So let’s take a walk and chat a little about this – no worries, there will be a LOT more discussion on this topic – so if you don’t agree or see things differently – let’s chat!

Let’s start with a little example everyone can relate to.   I have a small library of books in my home and I’m documenting the title, the author, and my rating of each book I’ve read over the year.  I am using Excel to gather this information – so when I’m finished reading my 30 books, I have an Excel file that has titles, authors, ratings for each of the 30 books I’ve read.  So far so good?  The Excel file I created is mine!  I own that representation of the information (titles, authors, ratings of the 30 books) or data.    Easy peasy – I, as the creator or originator of the information (data) own that representation (or format if you’d like).

Now, I have a group of friends that also read a number of books over the year, and we all want to pool our information to see if we all rated a book the same way, what books were read, and so on.  Yes, we can just get together with our paper lists and talk about what we read and how we rated – but that’s too easy!!  And heck we ARE Data GEEKS here and want to create a database with all this information and create some really cool visualizations – ok I’m taking us off track – back to the challenge at hand.  Let’s say Mary creates a database and brings in all of our Excel files into that database.   A couple of things happen now:

  1. Our initial information is now in another format or representation – a database, and
  2. You, as the creator or originator of your ratings, are no longer in possession of all representations of your information (data).
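
For the data geeks among us, here is a minimal sketch of what Mary's step might look like, assuming each Excel file has already been read into a simple list of (title, author, rating) rows. The names, titles, ratings, and the `pool_ratings` helper are all made up for illustration. The point is only that the pooled table is a new representation, and that tagging every row with its contributor keeps track of whose information is whose.

```python
import sqlite3

# Hypothetical reading lists: every name, title, and rating here is invented.
reading_lists = {
    "Michelle": [("Book A", "Author One", 4), ("Book B", "Author Two", 5)],
    "Friend": [("Book A", "Author One", 2)],
}


def pool_ratings(lists):
    """Copy each person's list into one shared table, tagging rows with the contributor."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ratings (contributor TEXT, title TEXT, author TEXT, rating INTEGER)"
    )
    for contributor, books in lists.items():
        conn.executemany(
            "INSERT INTO ratings VALUES (?, ?, ?, ?)",
            [(contributor, title, author, rating) for title, author, rating in books],
        )
    conn.commit()
    return conn


conn = pool_ratings(reading_lists)

# The pooled view: how many of us read each book, and our average rating.
summary = conn.execute(
    "SELECT title, COUNT(*), AVG(rating) FROM ratings GROUP BY title ORDER BY title"
).fetchall()
for row in summary:
    print(row)

# Because each row carries its contributor, one person's rows can still be found.
michelle_rows = conn.execute(
    "SELECT COUNT(*) FROM ratings WHERE contributor = ?", ("Michelle",)
).fetchone()[0]
```

Keeping the contributor column is a design choice in this sketch, not a requirement of any database: without it, the pooled table could no longer say which ratings came from whom.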

So now what???

This scenario is VERY common in the research data community, and, well in most data communities.

The BIG question and challenge – looking at our situation above – Mary is the creator of the database and “owner” of that representation of the information (data) she gathered – which just happens to include the information you gave her. Now – can Mary do whatever she wants with the information in the database? Can she share it with anyone she feels like? Can she massage it, analyse it, publish it? This is where things get GREY and UGLY! In my opinion the answer is a hard NO! Mary NEEDS to have a Data Usage agreement in place with everyone who contributed their information (data) to her database. Yes, my small group of friends asked her to create the database, but that doesn’t mean Mary can do what she wants with this information. There was a time when data was not valued as it is today, and a handshake and a verbal agreement were enough – but today, we need to have a written and signed agreement in place to ensure all parties understand their rights when it comes to the data – how it can be used, shared, analysed, etc.

As you may sense – this topic can and will create tensions – so I think I’ll stop here and pick it up again in the New Year with some sticky situations and the team will show how ADC is working to help improve these conversations by creating a new tool!

Stay tuned for more exciting developments in the New Year!

Photo generated with AI

Michelle