Funding for Agri-food Data Canada is provided in part by the Canada First Research Excellence Fund
I left off my last blog post with a question – well, actually a few questions: WHO owns this data? The supervisor – who is the PI on the research project you’ve been hired onto? OR you as the data collector and analyser? Hmmm…… When you think about these questions – the next question becomes WHO is responsible for the data and what happens to it?
As you already know, there really are NO clear answers to these questions. My recommendation is that the supervisor, PI, or lab manager set out a Standard Operating Procedures (SOP) guide for data collection. Yes, I know this really does NOT address the data ownership question – but it does address my last question: WHO is responsible for the data and what happens to it? And let’s face it – isn’t that just another elephant in the room? Who is responsible for making the research data FAIR?
Oh my, have I just jumped into another rabbit hole?
We have been talking about FAIR data, building tools, and making them accessible to our research community and beyond – BUT are we missing the bigger vision here? I talk to researchers and most agree that they want to make their data FAIR and share it beyond their lab – BUT… let’s be honest – that’s a lot of work! Who is going to do it? Here at ADC, our goal is to work with our research community to help them make agri-food data (and beyond) FAIR. We’ve been creating tools and training materials, and I thought we were on the precipice of changing the research data culture. Now I’m left wondering – who is RESPONSIBLE for setting out these procedures in a research project? WHO should be the TRUE force behind changing the data culture and encouraging FAIR research data?
Don’t worry – for anyone reading this – we are VERY determined to change the research data culture by continuing to make the transition to FAIR data easy and straightforward. It’s just an interesting question and one I would love for you all to consider – WHO is RESPONSIBLE for the data collected in a research project?
Till the next post – let’s consider Copyright and data – oh yes! Let’s tackle that hurdle 🙂
…and we’re back to the data ownership quandary…
Just when I think I may have heard all the different types of questions and situations that may arise in the context of data ownership – I hear a new one. When I first heard the situation I’m going to share with you in a moment – I thought nah.. this must be a one-off. But then I heard it again from a different individual and situation – so it MUST be a “thing”! When I’m honest with myself, look back, and contemplate my own situations – I’m left wondering too!!!
So let’s work through a research situation. You have been hired onto a project as a graduate student – working towards your MSc. You’re SO excited and happy about this wonderful opportunity you have. You work with your supervisor and lab group to create the most appropriate experimental design to answer your research question, and begin your data collection. You heard about the Semantic Engine and created your data schema to match your data collection. Two years down the road and you’re ready to move on – your thesis is complete and you’ve graduated. What about your data? What do you do with it?
The BIG question here – WHO owns this data? The supervisor – who is the PI on the research project you’ve been hired onto? OR you as the data collector and analyser? Hmmm…… When you think about these questions – the next question becomes WHO is responsible for the data and what happens to it? I would love to hear what readers think about this. Email me at edwardsm@uoguelph.ca if you have an opinion.
OK, what are my thoughts? I’ll let you know in my next blog post 🙂
Image created by Copilot
So we started the data ownership thread with The “Elephant in the Room” post. We started with a fairly clean example but there was definitely room for interpretation – enough for an elephant some may say…. Let’s try another situation and see what everyone’s thoughts are.
Many of us rely on the wonderful internet to find data sources – let’s not discuss what’s happening south of our border – that’s another conversation for another day. So, internet for data – BUT, what about the library? What about those wonderful items we call books? How about reports? Historical reports? Historical census? There is a hidden treasure trove – yes my favourite saying these days – of data – if you take the time to browse the library for reports or other sources of data.
Let’s use an example. Hmm… you just discovered that there are published reports from the 1940s and 1950s on books read and associated ratings by book club members from your town library. WOW!!! How cool!!! These reports show tables of book titles, randomized IDs for book club members (no names or any personal identifiers), and a rating for each book read by each book club member. Wouldn’t it be cool to create a database with this information so people could search it and use it for research or an essay? You think this would be a great project, so you take it on and create a database with the data contained in these reports. Can you see where I’m going with this???
So.. The library created the report in the 1940s and 1950s – stands to reason they “own” that representation of the data? BUT – you have now created a new representation of the data – it’s old stuff – who cares who owns it???? Nope! We care!!! So – who owns the data in this situation?? Like my earlier post – we can mull this over and over and talk it through and come up with different answers almost every day!!!
This is why we need to discuss who has the rights to access the data. Who can use it? I know everyone wants to talk about ownership – but let’s talk about ethical use of the information – and create data use agreements rather than spin on the “Who Owns the Data” question.
Well… I’ll stop here for today and come back to this in a couple of weeks. For now – think about who “owns” the data or rather WHO can determine who can access this new representation of the historical book review data?
Thoughts or comments? Let me know by sending an email.
Alrighty let’s address that proverbial Elephant in the Room – WHO owns the data? This question goes round and round and round – but does it ever land comfortably for everyone? We have been discussing this topic for quite a while here at Agri-food Data Canada, and it keeps cropping up in all the projects we are developing and working with. So let’s take a walk and chat a little about this – no worries there will be a LOT more discussion on this topic – so if you don’t agree or see things differently – let’s chat!
Let’s start with a little example everyone can relate to. I have a small library of books in my home and I’m documenting the title, the author, and my rating of each book I’ve read over the year. I am using Excel to gather this information – so when I’m finished reading my 30 books, I have an Excel file that has titles, authors, ratings for each of the 30 books I’ve read. So far so good? The Excel file I created is mine! I own that representation of the information (titles, authors, ratings of the 30 books) or data. Easy peasy – I, as the creator or originator of the information (data) own that representation (or format if you’d like).
Now, I have a group of friends that also read a number of books over the year, and we all want to pool our information to see if we all rated a book the same way, what books were read, and so on. Yes, we can just get together with our paper lists and talk about what we read and how we rated – but that’s too easy!! And heck we ARE Data GEEKS here and want to create a database with all this information and create some really cool visualizations – ok I’m taking us off track – back to the challenge at hand. Let’s say Mary creates a database and brings in all of our Excel files into that database. A couple of things happen now:
So now what???
This scenario is VERY common in the research data community, and, well in most data communities.
The BIG question and challenge – looking at our situation above – Mary is the creator of the database and “owner” of that representation of the information (data) she gathered – which just happens to include the information you gave her. Now – can Mary do whatever she wants with the information in the database? Can she share it with anyone she feels like? Can she massage it, analyse it, publish it? This is where things get GREY and UGLY! In my opinion the answer is a hard NO! Mary NEEDS to have a Data Usage agreement in place with everyone who contributed their information (data) to her database. Yes, my small group of friends asked her to create the database, but that doesn’t mean Mary can do what she wants with this information. There was a time when data was not valued as it is today, and a handshake and a verbal agreement were enough – but today, we need to have a written and signed agreement in place to ensure all parties understand their rights when it comes to the data – how it can be used, shared, analysed, etc.
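For the data geeks in the room, here is a minimal Python sketch of what Mary’s database might look like if every row remembered who contributed it and what that person agreed to. Everything in it is an assumption made up for illustration: the `Rating` and `Agreement` names, the `allowed_uses` labels, the placeholder people and book titles. It is not an ADC tool and certainly not a legal template.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rating:
    contributor: str  # who supplied this row (hypothetical field)
    title: str
    author: str
    rating: int


@dataclass
class Agreement:
    contributor: str
    # Uses this person has signed off on, e.g. {"analyse", "share", "publish"}
    allowed_uses: set = field(default_factory=set)


def rows_permitted_for(use, ratings, agreements):
    """Return only the rows whose contributor has agreed to this use."""
    by_person = {a.contributor: a for a in agreements}
    return [
        r for r in ratings
        if r.contributor in by_person
        and use in by_person[r.contributor].allowed_uses
    ]


# Placeholder data, invented purely for this example.
ratings = [
    Rating("Mary", "Book A", "Author X", 5),
    Rating("Sam", "Book A", "Author X", 3),
    Rating("Alex", "Book B", "Author Y", 4),
]
agreements = [
    Agreement("Mary", {"analyse", "share", "publish"}),
    Agreement("Sam", {"analyse"}),
    # Alex has not signed anything, so Alex's rows are never returned.
]

publishable = rows_permitted_for("publish", ratings, agreements)
analysable = rows_permitted_for("analyse", ratings, agreements)
print([r.contributor for r in publishable])
print([r.contributor for r in analysable])
```

In this sketch Mary can analyse her own rows and Sam’s, but she can only publish her own, and Alex’s row stays off limits until there is an agreement in place. The point is simply that the database keeps the contributor and the agreed uses next to the data, so the question “can Mary do this?” has an answer that everyone signed up to.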
As you may sense – this topic can and will create tensions – so I think I’ll stop here and pick it up again in the New Year with some sticky situations and the team will show how ADC is working to help improve these conversations by creating a new tool!
Stay tuned for more exciting developments in the New Year!
Photo generated with AI
© 2023 University of Guelph