Documenting your work- README: Research Data Management (RDM)

January 5, 2024

Happy New Year everyone!!! Welcome to 2024 – Leap year!!

Oh wow! How time is really flying by! It’s so easy for us to say this and see it happen in our everyday lives – BUT – yes, it also happens at work and with our research. I remember as a graduate student, in the thick of data collection, thinking I’m never going to finish this project by a given date – there’s too much to do – it’ll never happen! And just like that, whoosh, it’s over and done, and I’ve managed to complete a few research projects since. It’s just amazing how time really does fly by.

As we start a new year with new aspirations, what a great time to implement new habits in our research work! Ah yes, the dreaded documentation piece. Last time we spoke, I talked about variable names and provided you with a list of recommended best practices when creating your variable names for analysis. I also nudged you about keeping those labels, and using the Semantic Engine to create your data schema – check out Carly’s post about Crafting Effective Machine-Actionable Schemas.

So, we have variable names and a data schema, but is that ALL the documentation you should be keeping when you conduct a research project? Of course the answer is NO! Let’s review some other possible documentation pieces and ways to create the documentation.

README file

Let’s tackle the easy piece first, and probably the one that will take the longest. A README file is a text file that you should keep in the top folder of your project. Now, let’s first talk about what I mean by a text file: a file created and saved using Notepad on a Windows machine OR TextEdit on a Mac (set to plain text mode) – NOT Word!!! Now I’m sure you’re asking why in the world would I want to use a text editor – a program with NO formatting ability – my document is going to be ugly! Yes it will! BUT – using a text editor, aka creating a file with a .txt ending, will give you the comfort that your file will be readable by researchers in the future. Taking Word as an example, are you 100% positive that a document you save today will still be readable in the release that is out, say, 5 years from now? Can we read older Word documents today? If you have an older computer with an older version of Word, can you read a document that was created in a newer version of Word? Chances are you’ll have formatting challenges. So… let’s just avoid that nonsense and use a format that is archivable: .txt!

So now that we got that out of the way, what should we include in a README file? Think of the README file as your project or study level documentation. This is where you will describe your folder structure and explain your acronyms. This is also where you will give a brief abstract of your study, who the principal investigators are, timeframes, and any information you believe should be passed on to future researchers. Things like challenges with data collection – a downpour on day 10 prevented data collection – data collection was conducted by 3 new individuals on day 15, etc. Think about the information that you would find important if YOU were using another study’s data. If you are looking for examples, check out the READMEs in the OAC Historical Research Data and Reproducibility Project dataverse.
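If you like to script things, here is one possible way to start a README skeleton so that you only have to fill in the blanks. This is just a minimal sketch in Python, not a required template: the section names, the function name build_readme, and the sample values are all my own assumptions for illustration, loosely based on the items listed above. Adapt them to your own project.

```python
from pathlib import Path

# Hypothetical section headings, loosely based on the items discussed above.
# They are an assumption for illustration, not an official README standard.
README_SECTIONS = [
    "PROJECT TITLE",
    "ABSTRACT",
    "PRINCIPAL INVESTIGATORS",
    "TIMEFRAME",
    "FOLDER STRUCTURE",
    "ACRONYMS",
    "DATA COLLECTION NOTES",
]


def build_readme(details: dict[str, str]) -> str:
    """Return plain README text; sections without details get a TODO marker."""
    lines = []
    for section in README_SECTIONS:
        lines.append(section)
        lines.append("-" * len(section))  # plain-text underline, no formatting needed
        lines.append(details.get(section, "TODO: fill in"))
        lines.append("")  # blank line between sections
    return "\n".join(lines)


if __name__ == "__main__":
    # Sample values are made up for this example.
    sample = {
        "PROJECT TITLE": "Example field study",
        "DATA COLLECTION NOTES": "Day 10: downpour prevented data collection.",
    }
    print(build_readme(sample))
```

To save the result as a plain text file in the top folder of your project, you could write it out with something like Path("README.txt").write_text(build_readme(sample), encoding="utf-8"). The point is simply that the README ends up as a .txt file that any text editor can open.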

The README file is an often skipped yet crucial piece of documentation for any project. Some projects use a lab book to capture this information. No matter what medium you use, the end goal is to capture this information and create a text file for future use.

Conclusion

One more piece of documentation I want to talk about is capturing what happens in your statistical analysis. Let’s leave that for the next post.