Enabling Open Science Through Research Code: Insights from Episode 3 - Opening Up Research Code
In December 2024, we hosted episode three of our six-part series on Enabling Open Science through Research Code. This time, we focused on openly sharing research code. The conversation provided invaluable insights into best practices, challenges, and the impact of open source in research. Whether you are a researcher, software developer, or just getting started, here are the key takeaways from this inspiring discussion.
Why Open Research Code Matters
Research software plays a crucial role in scientific discovery. Making it openly accessible enhances transparency, reproducibility, and collaboration. The discussion highlighted several reasons why open code can be beneficial to those who develop it as well as the broader research community:
- Reproducibility & Credibility: When researchers share their code, others can validate and build upon it, ensuring scientific integrity.
- Collaboration & Community Growth: Open software enables researchers and developers worldwide to work together, improving tools and methodologies.
- Increased Citations & Impact: Studies with accessible code tend to receive more citations, increasing the visibility and influence of the research.
- Funding & Institutional Support: Many funding agencies now require researchers to share code and data as part of open science initiatives.
Challenges and How to Overcome Them
While making code open has clear benefits, it comes with its own set of challenges. Speakers shared personal experiences and solutions to common obstacles:
- Fear of Criticism: Many researchers worry that sharing their code may invite scrutiny. However, constructive feedback leads to improved code quality.
- Maintaining Code Over Time: Researchers often move on to new projects and cannot maintain old code. One solution is to create a community around a project, inviting contributors to help maintain and improve it.
- Data Privacy & Intellectual Property: When dealing with sensitive data, researchers must anonymise datasets or create synthetic versions that simulate the actual data while maintaining confidentiality.
- Documentation & Reusability: A well-documented project is far more valuable to others. Providing README files, usage instructions, and licensing information makes it easier for others to use and contribute.
- We talk more about documentation in Episode 4!
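To make the synthetic data idea from the privacy point above more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a method the speakers described: the function name, the example measurements, and the choice of a normal distribution are all hypothetical. It only preserves the mean and spread of a single numeric column, which on its own is not a confidentiality guarantee.

```python
import random
import statistics


def synthesize_column(real_values, n_rows, seed=0):
    """Draw n_rows synthetic values from a normal distribution
    fitted to the mean and standard deviation of real_values.

    Assumption: the column is numeric and roughly bell-shaped.
    No real value is copied into the output.
    """
    mean = statistics.fmean(real_values)
    spread = statistics.stdev(real_values)
    rng = random.Random(seed)  # fixed seed so others can reproduce the file
    return [rng.gauss(mean, spread) for _ in range(n_rows)]


# Hypothetical sensitive measurements that cannot be shared directly.
real_measurements = [52.1, 60.4, 71.3, 66.0, 58.7, 80.2]

# A shareable stand-in that lets others run the analysis code end to end.
synthetic_measurements = synthesize_column(real_measurements, n_rows=5000, seed=42)
```

A stand-in dataset like this lets others run and test the shared code even when the original data has to stay private. Real projects with sensitive data would likely need a more careful approach than matching two summary statistics.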
Best Practices for Sharing Research Code
To make code truly open and useful, researchers can consider the following best practices:
- Use Version Control Platforms: Platforms like GitHub, GitLab, and Bitbucket facilitate collaboration, version tracking, and code sharing.
- Choose an Appropriate License: Open source licenses like MIT, Apache 2.0, and GPL define how others can use and modify the code.
- Provide Clear Documentation: A README file explaining installation, usage, dependencies, and citation details makes the code accessible to others.
- Include Citation Information: Using a CITATION.cff file ensures that researchers get credit for their work when their code is used in publications.
- Use Containerisation for Reproducibility: Tools like Docker and GitHub Codespaces help ensure that code runs the same way on different machines.
- Leverage Community Support: Engaging with communities like Research Software Engineering (RSE) groups, The Carpentries, and others provides learning opportunities and networking.
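Several of the practices above come down to a few files in the repository. The sketch below, written in Python, checks a project folder for a README, a license, and a CITATION.cff file, and builds the text of a minimal CITATION.cff. The function names, the list of accepted file names, and the example title and author are assumptions made for illustration; the CITATION.cff keys used (cff-version, message, title, authors) are the required fields of the Citation File Format 1.2.0.

```python
import json
import tempfile
from pathlib import Path

# File names this sketch accepts for each item; the exact list is an assumption.
EXPECTED_FILES = {
    "README": ("README.md", "README.rst", "README.txt", "README"),
    "LICENSE": ("LICENSE", "LICENSE.md", "LICENSE.txt", "COPYING"),
    "CITATION": ("CITATION.cff",),
}


def missing_sharing_files(repo_dir):
    """Return the labels (README, LICENSE, CITATION) that have no matching file."""
    repo = Path(repo_dir)
    return [
        label
        for label, names in EXPECTED_FILES.items()
        if not any((repo / name).is_file() for name in names)
    ]


def minimal_citation_cff(title, authors):
    """Build the text of a minimal CITATION.cff file.

    authors is a list of (given_names, family_names) tuples.
    json.dumps produces double-quoted strings, which are also valid YAML.
    """
    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this software, please cite it as below."',
        f"title: {json.dumps(title)}",
        "authors:",
    ]
    for given, family in authors:
        lines.append(f"  - given-names: {json.dumps(given)}")
        lines.append(f"    family-names: {json.dumps(family)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Demonstrate on a temporary folder that only contains a README.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "README.md").write_text("Demo project\n")
        print("Missing:", missing_sharing_files(tmp))
    # Hypothetical title and author, for illustration only.
    print(minimal_citation_cff("Demo Analysis Tool", [("Ada", "Example")]))
```

Running a check like this before publishing a repository is one small way to catch a missing license or citation file early.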
Diverse Perspectives on Open Science
The discussion brought together speakers from various backgrounds, including bioinformatics, linguistics, software engineering, and illustration. Their stories emphasised that open science is not just for software developers but also for researchers and creatives working in different fields.
- Adeyinka Oresanya, a software developer at CHAOSS in Nigeria
- Chioma Onyido, a bioinformatician from Nigeria, shared how Bioconductor, an open-source project for bioinformatics, has been instrumental in her work.
- Mars Lee, a technical illustrator, highlighted the importance of visual documentation to make open source more accessible.
- Kate Huddlestone, a linguist at Stellenbosch University in South Africa, shared how open-source tools enabled her to work with sign language data, significantly advancing her research.
- Juan Pablo, a senior program manager at GitHub, discussed how GitHub provides tools like citation files and ORCID integration to make research software more discoverable and citable.
Final Advice for Researchers Interested in Open Code
Each speaker shared a key piece of advice for those looking to engage in open research software:
- Join a Community: Engaging with open-source and research communities provides learning and collaboration opportunities.
- Take Small Steps: Even minimal documentation, a simple README, or choosing a license makes a big difference.
- Leverage Available Resources: Many educational institutions provide free access to GitHub tools and training programs.
- Don't Fear Making Mistakes: Sharing imperfect code is better than keeping it inaccessible. The community will help improve it.
Conclusion
Opening research code benefits both individual researchers and the broader scientific community. By embracing best practices and leveraging community support, researchers can enhance collaboration, improve research reproducibility, and increase the impact of their work. Whether you are just starting or looking to improve your open science practices, remember that engagement, documentation, and a willingness to share and learn are the keys to success. For those interested in exploring more resources mentioned during the session, please download our Resource Sheet and view the episode recording.
Remember to sign up for our next episode, which focuses on funding for research software projects and is scheduled for 20 March 2025, 08:30 - 10:00 UTC.
{The first draft of this post was written by ChatGPT using the transcript of our meetup as input}
This meetup series is a collaboration between Talarify, RSSE Africa, RSE Asia, AREN, and ReSA.