Enabling Open Science through Research Code: Insights from Episode 4 - Documentation for Research Code

Anelda van der Walt, Jyoti Bhogal, Jenny Wong, Joel Nitta, Mthetho Sovara

Last updated on Feb 19, 2025 4 min read Open Science

Imagine you’ve spent months crafting a complex research software tool, refining algorithms, and running countless simulations. It works brilliantly—but then, six months later, you revisit the code and… you’re lost. Sounds familiar? You’re not alone. Experts shared their experiences and strategies for writing good documentation in the fourth episode of the Research Software Community Conversations Series titled Enabling Open Science through Research Code. Here’s what we learned from our participants and insightful panellists:

Jenny Wong - Product Manager @ 2i2c (UK/US)
Joel Nitta - Associate Professor @ Chiba University (Japan)
Mthetho Sovara - Centre for High-Performance Computing (CHPC), National Integrated Cyberinfrastructure System (NICIS) CSIR (South Africa)

📌 Why Documentation is Essential

🚀 Boosts Reproducibility: Research software often accompanies papers, but results can be difficult to replicate without documentation. As Jenny pointed out, in fluid dynamics research, well-documented simulation codes gained more adoption than poorly documented alternatives.

🤝 Facilitates Collaboration: Well-documented projects attract more contributors and streamline knowledge transfer. Joel shared his experience in the rOpenSci community, where clear documentation allowed him to take over and improve an existing R package without needing direct input from the original creator.

⏳ Enhances Longevity: Code without documentation quickly becomes unusable—even by its original authors! Mthetho described how well-documented software in weather forecasting led to unexpected collaborations with the South African Weather Service and Ethiopian researchers.

🔓 Promotes Open Science: Accessible research software fosters innovation and inclusion worldwide. Jenny emphasised that accessible documentation enables wider adoption of tools, making scientific computing more democratic.

🔥 Common Challenges & Solutions

⏰ Time Constraints → Write minimal but clear documentation as you code. Joel suggested that legible code, such as meaningful variable names instead of “x” or “dat”, is itself a form of documentation that is even better than comments.

🧩 Lack of Standardisation → Use templates and structured frameworks. Mthetho mentioned that clear, structured guides helped non-computational scientists successfully use OpenFOAM in unexpected fields like environmental science.

🎯 Technical vs. User Documentation → Engage users in documentation efforts for clarity. Jenny highlighted the role of community-driven documentation efforts, such as hackathons, to improve usability.

🛠 Best Practices for Better Documentation

✍️ 1. Write as You Code

Waiting until the end leads to missing details. Add inline comments and README files while developing. Joel emphasised that tools like RMarkdown (and its successor, Quarto) can be used for literate programming—thereby ensuring that the results of code are automatically generated alongside their description.

🔄 2. Differentiate Documentation Types

✔ User-facing documentation: How to install, use, and troubleshoot.

✔ Developer-facing documentation: Comments, API references, contribution guides.

📚 3. Use the Right Tools

📄 Markdown & Jupyter Notebooks – Combine narrative and code.

📝 Quarto & R Markdown – For structured research outputs.

📘 Sphinx & MIST – Python-based documentation frameworks.

🌐 GitHub Wikis & Pages – Host documentation openly.

🤖 AI-based tools (e.g., Cursor, Copilot) – Assist with documentation (but always review!).

🤝 4. Community & Collaboration Matter

💡 Participate in documentation hackathons. Jenny shared how Project Pythia’s hackathons provided a structured way to improve documentation in geospatial science.

🌍 Join communities like Project Pythia, The Carpentries, and Write the Docs.

💰 5. Advocate for Funding & Policy Support

📢 Push for documentation funding in grant proposals.

🏛 Engage with funders to integrate documentation into open science policies.

🚀 Future of Documentation: AI & Automation

🤖 Automated Documentation: AI tools generate initial drafts, saving researchers time. However, while helpful, AI-assisted documentation should always be reviewed for accuracy.

🌍 AI for Translation: Helps make documentation accessible in multiple languages, fostering global inclusivity. Anelda highlighted how AI tools might support research documentation in multilingual communities in the future. For now, we still need a human-in-the-loop translator for translation efforts. Especially for low-resourced languages.

⚠️ Caution Needed: AI-generated content must be reviewed for accuracy and ethical considerations. Jenny raised concerns about biases in AI and the environmental impact of large-scale machine learning models.

🎯 Final Thoughts: Make Documentation a Priority

Great research code deserves great documentation. It makes work reproducible, fosters collaboration, and ensures that software remains usable long after its creator has moved on. By embracing best practices and community support, we can all contribute to a culture where documentation is seen not as a burden—but as a fundamental part of open science.

📢 Explore More:

Access our Resource Sheet, which contains numerous valuable resources shared by the panellists, facilitators, and participants.

View the session recording on YouTube.

This meetup series is a collaboration between Talarify, RSSE Africa, RSE Asia, AREN, and ReSA.

Open Science